[llvm] [SROA] Only try additional vector type candidates when needed (PR #77678)
Craig Topper via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 7 15:12:30 PST 2024
topperc wrote:
I'm seeing a regression after this patch.
Before this patch:
```
Rewriting alloca partition [0,16) to: %cond.sroa.0 = alloca <2 x i64>, align 16
rewriting [0,16) slice #0
Begin:(0, 16) NewBegin:(0, 16) NewAllocaBegin:(0, 16)
original: %2 = load <4 x i32>, ptr %m5, align 8, !tbaa !8
to: %2 = bitcast <2 x i64> %cond.sroa.0.0.load to <4 x i32>
rewriting [0,8) slice #1
Begin:(0, 8) NewBegin:(0, 8) NewAllocaBegin:(0, 16)
original: store i64 %cond.coerce.fca.0.extract, ptr %cond.coerce.fca.0.gep, align 8
insert: %cond.sroa.0.0.vec.insert = insertelement <2 x i64> %cond.sroa.0.0.load9, i64 %cond.coerce.fca.0.extract, i32 0
to: store <2 x i64> %cond.sroa.0.0.vec.insert, ptr %cond.sroa.0, align 16
rewriting [8,16) slice #2
Begin:(8, 16) NewBegin:(8, 16) NewAllocaBegin:(0, 16)
original: store i64 %cond.coerce.fca.1.extract, ptr %cond.coerce.fca.1.gep, align 8
insert: %cond.sroa.0.8.vec.insert = insertelement <2 x i64> %cond.sroa.0.8.load, i64 %cond.coerce.fca.1.extract, i32 1
to: store <2 x i64> %cond.sroa.0.8.vec.insert, ptr %cond.sroa.0, align 16
```
After this patch:
```
Rewriting alloca partition [0,16) to: %cond.sroa.0 = alloca <4 x i32>, align 16
rewriting [0,16) slice #0
Begin:(0, 16) NewBegin:(0, 16) NewAllocaBegin:(0, 16)
original: %2 = load <4 x i32>, ptr %m5, align 8, !tbaa !8
to: %cond.sroa.0.0.load = load <4 x i32>, ptr %cond.sroa.0, align 16
rewriting [0,8) slice #1
Begin:(0, 8) NewBegin:(0, 8) NewAllocaBegin:(0, 16)
original: store i64 %cond.coerce.fca.0.extract, ptr %cond.coerce.fca.0.gep, align 8
shuffle: %cond.sroa.0.0.vec.expand = shufflevector <2 x i32> %0, <2 x i32> poison, <4 x i32> <i32 0, i32 1, i32 poison, i32 poison>
blend: %cond.sroa.0.0.vecblend = select <4 x i1> <i1 true, i1 true, i1 false, i1 false>, <4 x i32> %cond.sroa.0.0.vec.expand, <4 x i32> %cond.sroa.0.0.load9
to: store <4 x i32> %cond.sroa.0.0.vecblend, ptr %cond.sroa.0, align 16
rewriting [8,16) slice #2
Begin:(8, 16) NewBegin:(8, 16) NewAllocaBegin:(0, 16)
original: store i64 %cond.coerce.fca.1.extract, ptr %cond.coerce.fca.1.gep, align 8
shuffle: %cond.sroa.0.8.vec.expand = shufflevector <2 x i32> %1, <2 x i32> poison, <4 x i32> <i32 poison, i32 poison, i32 0, i32 1>
blend: %cond.sroa.0.8.vecblend = select <4 x i1> <i1 false, i1 false, i1 true, i1 true>, <4 x i32> %cond.sroa.0.8.vec.expand, <4 x i32> %cond.sroa.0.8.load
to: store <4 x i32> %cond.sroa.0.8.vecblend, ptr %cond.sroa.0, align 16
```
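The new alloca type is <4 x i32> instead of <2 x i64>, so each i64 store is now rewritten as a shufflevector plus a select (blend) rather than a single insertelement. The shufflevector and blend give worse codegen than the previous code.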
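To make the difference between the two rewrites concrete, here is a minimal C++ sketch that models them on plain arrays. It is only a value-level illustration, not SROA's implementation: the names `V2i64`, `V4i32`, `insert_i64`, `expand_and_blend`, and `same_bytes` are hypothetical, and the step that views the i64 as two i32 halves is an assumption about how `%0` and `%1` are produced, since the log does not show it. The sketch shows that both forms write the same 16 bytes; it says nothing about the cost of the generated code, which depends on the target.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for the two candidate alloca types.
using V2i64 = std::array<std::uint64_t, 2>;
using V4i32 = std::array<std::uint32_t, 4>;

// "Before" form: a single insertelement into <2 x i64>.
V2i64 insert_i64(V2i64 vec, std::uint64_t value, unsigned lane) {
    vec[lane] = value;
    return vec;
}

// "After" form: the i64 is viewed as <2 x i32> (assumed bitcast),
// widened to <4 x i32> (shufflevector), then merged with the old
// vector through a lane mask (select).
V4i32 expand_and_blend(V4i32 vec, std::uint64_t value, unsigned first_lane) {
    std::array<std::uint32_t, 2> halves;
    std::memcpy(halves.data(), &value, sizeof value);

    V4i32 expanded{};  // unwritten lanes stand in for poison
    expanded[first_lane] = halves[0];
    expanded[first_lane + 1] = halves[1];

    V4i32 result;
    for (unsigned i = 0; i < 4; ++i) {
        const bool take_new = (i >= first_lane) && (i < first_lane + 2);
        result[i] = take_new ? expanded[i] : vec[i];
    }
    return result;
}

// Compare the raw 16 bytes of both vector representations.
bool same_bytes(const V2i64& a, const V4i32& b) {
    return std::memcmp(a.data(), b.data(), 16) == 0;
}
```

Slice #1 in the logs corresponds to `insert_i64(v, x, 0)` versus `expand_and_blend(v, x, 0)`, and slice #2 to `insert_i64(v, x, 1)` versus `expand_and_blend(v, x, 2)`. The second form needs an extra widening step and a masked merge for each store, which is where the additional instructions come from.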
https://github.com/llvm/llvm-project/pull/77678