[llvm] [SDag] Notify listeners when deleting a node (PR #66991)
Sergei Barannikov via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 21 02:20:05 PDT 2023
================
@@ -29,7 +33,9 @@ define <8 x i32> @shuffle_v8i32_0dcd3f14_constant(<8 x i32> %a0) {
; CHECK-NEXT: vblendps {{.*#+}} xmm1 = xmm1[0],xmm0[1,2,3]
; CHECK-NEXT: vshufps {{.*#+}} xmm1 = xmm1[3,1,1,0]
; CHECK-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
-; CHECK-NEXT: vblendps {{.*#+}} ymm0 = ymm0[0],mem[1,2,3],ymm0[4],mem[5],ymm0[6,7]
+; CHECK-NEXT: vbroadcastf128 {{.*#+}} ymm1 = mem[0,1,0,1]
+; CHECK-NEXT: vshufpd {{.*#+}} ymm1 = ymm1[0,0,3,2]
+; CHECK-NEXT: vblendps {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3],ymm0[4],ymm1[5],ymm0[6,7]
----------------
s-barannikov wrote:
This looks like a regression.
```
Legalizing: t50: v4f64 = vector_shuffle<2,2,3,u> t48, undef:v4f64
Trying custom legalization
Successfully custom legalized node
... replacing: t50: v4f64 = vector_shuffle<2,2,3,u> t48, undef:v4f64
with: t53: v4f64 = vector_shuffle<0,0,3,u> t52, undef:v4f64
```
Before this patch, t52 and t53 are not legalized during DAG legalization, and the "legalized" DAG is:
```
SelectionDAG has 32 nodes:
t0: ch,glue = EntryToken
t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %0
t28: i64 = X86ISD::Wrapper TargetConstantPool:i64<<8 x i32> <i32 undef, i32 undef, i32 undef, i32 undef, i32 13, i32 14, i32 undef, i32 16>> 0
t26: v8i32,ch = load<(load (s256) from constant-pool)> t0, t28, undef:i64
t48: v4f64 = bitcast t26
t52: v4f64 = vector_shuffle<2,3,2,3> t48, undef:v4f64
t53: v4f64 = vector_shuffle<0,0,3,u> t52, undef:v4f64
t51: v8f32 = bitcast t53
...
```
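As a sanity check that the custom legalization preserved the original shuffle, the two masks can be composed by hand. This is a plain Python sketch, not LLVM code: `compose` is a hypothetical helper, and `None` stands in for an undef (`u`) lane.

```python
def compose(inner, outer):
    """Mask of shuffle(shuffle(x, inner), outer) for single-input shuffles.

    None marks an undef lane and stays undef after composition.
    """
    return [None if i is None else inner[i] for i in outer]

t52_mask = [2, 3, 2, 3]     # t52 = vector_shuffle<2,3,2,3> t48
t53_mask = [0, 0, 3, None]  # t53 = vector_shuffle<0,0,3,u> t52

# Expressed directly in terms of t48, t53 should equal the original t50 mask.
combined = compose(t52_mask, t53_mask)
print(combined)
```

The combined mask is `<2,2,3,u>`, the mask of the original t50.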
t53 is legalized by the DAG combiner into VPERMILPI, which is immediately folded with t52 and t26 into:
`v4f64 = BUILD_VECTOR ConstantFP:f64<2.970794e-313>, ConstantFP:f64<2.970794e-313>, ConstantFP:f64<3.395193e-313>, ConstantFP:f64<2.970794e-313>`
After that, t52 is deleted as dead without ever being legalized.
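The odd-looking `ConstantFP` values are the i32 pairs from the constant pool reinterpreted as f64. A rough Python sketch of that fold follows, assuming little-endian layout and treating undef words as 0 (both assumptions, not something the dump states). The helper names are hypothetical.

```python
import struct

def bitcast_v8i32_to_v4f64(words):
    """Reinterpret eight 32-bit words as four little-endian f64 values."""
    return list(struct.unpack("<4d", struct.pack("<8I", *words)))

def shuffle(vec, mask):
    """Single-input vector_shuffle; None in the mask is an undef lane."""
    return [None if i is None else vec[i] for i in mask]

# Constant pool entry <u,u,u,u,13,14,u,16>, with undef modelled as 0.
pool = [0, 0, 0, 0, 13, 14, 0, 16]

t48 = bitcast_v8i32_to_v4f64(pool)   # bitcast t26
t52 = shuffle(t48, [2, 3, 2, 3])     # vector_shuffle<2,3,2,3>
t53 = shuffle(t52, [0, 0, 3, None])  # vector_shuffle<0,0,3,u>

for lane in t53:
    print("undef" if lane is None else "%e" % lane)
```

The first three lanes come out as 2.970794e-313, 2.970794e-313 and 3.395193e-313, matching the BUILD_VECTOR operands. The last lane is undef here; the combiner is free to pick any value for it.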
After this patch, t52 and t53 are legalized during the DAG legalization phase, and the resulting DAG is:
```
SelectionDAG has 31 nodes:
t0: ch,glue = EntryToken
t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %0
t28: i64 = X86ISD::Wrapper TargetConstantPool:i64<<8 x i32> <i32 undef, i32 undef, i32 undef, i32 undef, i32 13, i32 14, i32 undef, i32 16>> 0
t66: i64 = add t28, Constant:i64<16>
t67: v4f64,ch = X86ISD::SUBV_BROADCAST_LOAD<(load (s128) from constant-pool + 16, basealign 32)> t0, t66
t64: v4f64 = X86ISD::VPERMILPI t67, TargetConstant:i8<4>
t51: v8f32 = bitcast t64
...
```
VPERMILPI is not combined with SUBV_BROADCAST_LOAD, presumably because the optimizer cannot look through the `add` at t66 to extract the shuffle mask stored in the constant pool.
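For reference, the new sequence still computes a compile-time constant. The Python sketch below decodes the VPERMILPI immediate and evaluates the broadcast by hand. It assumes the documented VPERMILPD encoding (bit i of the immediate selects the low or high f64 within the 128-bit lane of result element i), little-endian layout, and undef words equal to 0. `decode_vpermilpd_imm` is a hypothetical helper, not an LLVM function.

```python
import struct

def decode_vpermilpd_imm(imm, num_elts=4):
    """Turn a VPERMILPD immediate into a shuffle mask over f64 elements."""
    mask = []
    for i in range(num_elts):
        lane_base = (i // 2) * 2  # first element of this 128-bit lane
        mask.append(lane_base + ((imm >> i) & 1))
    return mask

# Constant pool entry <u,u,u,u,13,14,u,16>, with undef modelled as 0.
pool = [0, 0, 0, 0, 13, 14, 0, 16]
raw = struct.pack("<8I", *pool)

# SUBV_BROADCAST_LOAD: load 128 bits at offset 16 and repeat them twice.
upper_half = struct.unpack("<2d", raw[16:32])
t67 = list(upper_half) * 2

# VPERMILPI t67, TargetConstant:i8<4>
mask = decode_vpermilpd_imm(4)
t64 = [t67[i] for i in mask]

print(mask)
print(["%e" % x for x in t64])
```

The decoded mask is `[0, 0, 3, 2]`, which agrees with the `vshufpd` line in the diff, and under these assumptions t64 holds the same four values as the BUILD_VECTOR produced before the patch.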
I don't know if this is a serious regression. The test seems to be testing something different.
https://github.com/llvm/llvm-project/pull/66991