[PATCH] D143787: [X86] Try to use `{v}shufps` instead of `vpermilps` for common float shuffles.

Wed Feb 15 02:07:40 PST 2023

RKSimon added a comment.

We have a number of cases where a specific instruction is faster on one target than another, or there's no domain switch cost and we can use smaller variants, etc.

This comes to mind as well: https://github.com/llvm/llvm-project/issues/43458

We can use tuning flags, but in many cases that's just confusing and duplicates what our scheduler models already tell us. Plus we end up with many permutations of DAG / isel that don't always work well together, or cause infinite loops, etc.

My idea was for a small pass similar to FixupLEAs/FixupBWInsts that we can drive from a mixture of the subtarget tuning flags and the scheduler model to decide between various equivalent instruction options based on cost estimates. Shuffle ops lend themselves to this the most, but there are probably others we might consider as well.

Replacing one instruction with another is trivial; it might also be feasible to replace a single instruction with multiple instructions (HADD/HSUB expansion, funnel shifts, etc.).
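
To make the idea a bit more concrete, here is a rough standalone sketch of the selection step only. It does not use any LLVM API: `InstrCost`, `Alternative`, `TuningFlags`, `estimateCost` and `pickCheapest` are all hypothetical names, and the cost formula (reciprocal throughput plus latency, or encoded size when optimising for size) is just an assumption for illustration. A real pass would pull these numbers from the scheduler model and subtarget rather than from hand-written structs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-instruction costs, standing in for what a scheduler
// model would report. None of this is real LLVM API.
struct InstrCost {
  std::string Name;
  double RecipThroughput; // cycles per instruction, lower is better
  unsigned Latency;       // cycles
  unsigned EncodedBytes;  // encoding size
};

// One equivalent lowering: a sequence of one or more instructions.
struct Alternative {
  std::vector<InstrCost> Seq;
  bool CrossesDomain; // would incur a domain switch on this target
};

// Hypothetical stand-in for subtarget tuning flags.
struct TuningFlags {
  unsigned DomainSwitchPenalty; // extra cycles, 0 if the switch is free
  bool OptForSize;
};

// Assumed cost formula: total encoded bytes when optimising for size,
// otherwise throughput plus latency, plus any domain-switch penalty.
double estimateCost(const Alternative &A, const TuningFlags &T) {
  double Cost = 0.0;
  for (const InstrCost &I : A.Seq) {
    if (T.OptForSize)
      Cost += static_cast<double>(I.EncodedBytes);
    else
      Cost += I.RecipThroughput + static_cast<double>(I.Latency);
  }
  if (A.CrossesDomain && !T.OptForSize)
    Cost += static_cast<double>(T.DomainSwitchPenalty);
  return Cost;
}

// Returns the index of the cheapest alternative; ties keep the earliest
// entry. Precondition: Alts is non-empty.
std::size_t pickCheapest(const std::vector<Alternative> &Alts,
                         const TuningFlags &T) {
  std::size_t Best = 0;
  for (std::size_t I = 1; I < Alts.size(); ++I)
    if (estimateCost(Alts[I], T) < estimateCost(Alts[Best], T))
      Best = I;
  return Best;
}
```

Treating each alternative as a sequence means the single-instruction swap and the one-to-many expansion go through the same comparison; the existing instruction would simply be entry 0, so it is kept unless something is strictly cheaper.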

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143787/new/

https://reviews.llvm.org/D143787