[PATCH] D140811: [DAGCombiner][X86] `visitVECTOR_SHUFFLE()`: splats with a single non-undef element are not splats

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 9 17:49:47 PST 2023


lebedev.ri added a subscriber: tstellar.
lebedev.ri marked an inline comment as done.
lebedev.ri added inline comments.


================
Comment at: llvm/test/CodeGen/X86/horizontal-sum.ll:66
+; AVX2-SLOW-NEXT:    vinsertps {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[0]
 ; AVX2-SLOW-NEXT:    retq
   %5 = shufflevector <4 x float> %0, <4 x float> poison, <2 x i32> <i32 0, i32 2>
----------------
RKSimon wrote:
> regression - we've gone from 3hops to 4hops + extra shuffles
OK, I'll start with this one, I guess.
It also seems reasonably straightforward:
at least as a first step, we need to go from
    t23: v2f32 = vector_shuffle<1,u> t21, undef:v2f32
  t24: v2f32 = fadd t21, t23
    t33: v2f32 = vector_shuffle<1,u> t32, undef:v2f32
  t34: v2f32 = fadd t32, t33
t75: v4f32 = concat_vectors t24, t34
```
to (pseudocode)
```
i0: v4f32 = concat_vectors t21, t32
i1: v4f32 = vector_shuffle<1,u,3,u> i0, undef:v4f32
i2: v4f32 = fadd i1, i0
```
I'm guessing that *just* folding a `concat_vectors` of identical opcodes
into a single opcode over multiple `concat_vectors` may not be a win on its own, though,
and the shuffles must be matched too. Not sure yet.
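
As a sanity check on the rewrite above, here is a minimal lane-level sketch in Python, not LLVM code. It models `undef` lanes as `None` and assumes that an `fadd` with an undef operand yields an undef lane, which is a simplification of the real SelectionDAG semantics. The helper names simply mirror the node names in the dumps, and `before`/`after` are hypothetical names for the two DAG shapes.

```python
# Hypothetical lane-level model of the two DAG shapes; this is not LLVM code.
# None stands in for an undef lane (an assumption made for illustration).
UNDEF = None


def vector_shuffle(mask, vec):
    """Single-input shuffle; a mask entry of "u" produces an undef lane."""
    return [UNDEF if m == "u" else vec[m] for m in mask]


def fadd(a, b):
    """Lane-wise add; a lane is undef if either operand lane is undef."""
    return [UNDEF if x is UNDEF or y is UNDEF else x + y for x, y in zip(a, b)]


def concat_vectors(a, b):
    """Concatenate two vectors into one wider vector."""
    return list(a) + list(b)


def before(t21, t32):
    """Two narrow shuffle+fadd chains, concatenated at the end."""
    t23 = vector_shuffle([1, "u"], t21)
    t24 = fadd(t21, t23)
    t33 = vector_shuffle([1, "u"], t32)
    t34 = fadd(t32, t33)
    return concat_vectors(t24, t34)


def after(t21, t32):
    """Concatenate first, then one wide shuffle and one wide fadd."""
    i0 = concat_vectors(t21, t32)
    i1 = vector_shuffle([1, "u", 3, "u"], i0)
    return fadd(i1, i0)
```

With `t21 = [1.0, 2.0]` and `t32 = [3.0, 4.0]`, both forms give `[3.0, None, 7.0, None]`, so the defined lanes agree and the undef lanes land in the same positions.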

But I'm getting mixed signals here.
@RKSimon, should this kind of straightforward yak shaving just be committed directly, or submitted to Phab first?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140811/new/

https://reviews.llvm.org/D140811


