[PATCH] D140811: [DAGCombiner][X86] `visitVECTOR_SHUFFLE()`: splats with a single non-undef element are not splats
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 9 17:49:47 PST 2023
lebedev.ri added a subscriber: tstellar.
lebedev.ri marked an inline comment as done.
lebedev.ri added inline comments.
================
Comment at: llvm/test/CodeGen/X86/horizontal-sum.ll:66
+; AVX2-SLOW-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[0]
; AVX2-SLOW-NEXT: retq
%5 = shufflevector <4 x float> %0, <4 x float> poison, <2 x i32> <i32 0, i32 2>
----------------
RKSimon wrote:
> regression - we've gone from 3hops to 4hops + extra shuffles
OK, I'll start with this one, I guess.
It also seems reasonably straightforward:
as a first step, at least, we need to go from
```
t23: v2f32 = vector_shuffle<1,u> t21, undef:v2f32
t24: v2f32 = fadd t21, t23
t33: v2f32 = vector_shuffle<1,u> t32, undef:v2f32
t34: v2f32 = fadd t32, t33
t75: v4f32 = concat_vectors t24, t34
```
to (pseudocode)
```
i0: v4f32 = concat_vectors t21, t32
i1: v4f32 = vector_shuffle<1,u,3,u> i0, undef:v4f32
i2: v4f32 = fadd i1, i0
```
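To make the intended equivalence concrete, here is a minimal Python model of the two DAG forms above. It is only a sketch, not LLVM code: the helper names (`vector_shuffle`, `fadd`, `concat_vectors`, `widen_mask`) are hypothetical stand-ins for the DAG nodes, undef lanes are modelled as `None`, and an `fadd` with an undef input is assumed to yield an undef lane, which is a simplification.

```python
UNDEF = None  # stand-in for an undef lane (a simplification, not LLVM semantics)


def vector_shuffle(mask, vec):
    """Single-input shuffle (second operand undef); a mask entry of -1 means 'u'."""
    return [UNDEF if m < 0 else vec[m] for m in mask]


def fadd(a, b):
    """Lane-wise add; assumed to give an undef lane if either input lane is undef."""
    return [UNDEF if x is UNDEF or y is UNDEF else x + y for x, y in zip(a, b)]


def concat_vectors(*vecs):
    """Concatenate the lanes of all operands in order."""
    return [lane for v in vecs for lane in v]


def widen_mask(mask, num_parts):
    """Repeat a narrow mask once per concatenated part, offsetting defined indices."""
    width = len(mask)
    return [m if m < 0 else m + part * width
            for part in range(num_parts) for m in mask]


def before(t21, t32):
    """The original form: two narrow shuffle+fadd pairs, then a concat."""
    t23 = vector_shuffle([1, -1], t21)
    t24 = fadd(t21, t23)
    t33 = vector_shuffle([1, -1], t32)
    t34 = fadd(t32, t33)
    return concat_vectors(t24, t34)


def after(t21, t32):
    """The proposed form: concat first, then one wide shuffle and one wide fadd."""
    i0 = concat_vectors(t21, t32)
    i1 = vector_shuffle(widen_mask([1, -1], 2), i0)
    return fadd(i1, i0)
```

Under these assumptions, `widen_mask([1, -1], 2)` produces the `<1,u,3,u>` mask from the pseudocode, and `before` and `after` agree on every lane for the same inputs.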
I'm guessing that *just* folding a `concat_vectors` of identical opcodes
into a single opcode over multiple `concat_vectors` may not be a win by itself, though,
and the shuffles must be matched too. Not sure yet.
But I'm getting mixed signals here.
@RKSimon Should this kind of straightforward yak shaving just be committed, or submitted to Phab first?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D140811/new/
https://reviews.llvm.org/D140811