[PATCH] D124743: [DAGCombine] Add node in the worklist in topological order in CombineTo
Amaury SECHET via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri May 6 05:44:51 PDT 2022
deadalnix added inline comments.
================
Comment at: llvm/test/CodeGen/ARM/swifterror.ll:687
+; CHECK-APPLE-NEXT: add r0, r0, #4
+; CHECK-APPLE-NEXT: ldr r2, [r7, #8]
+; CHECK-APPLE-NEXT: ldr r1, [r0], #4
----------------
deadalnix wrote:
> RKSimon wrote:
> > This looks like another lost multiple load/store ?
> Yes, this looks similar to what is happening to arm64-abi-varargs.ll .
Investigating this a bit more, it looks like we get some crazy combine happening when CombineTo work in topological order that are missed otherwise.
The dag looks like this with topological CombineTo:
```
SelectionDAG has 58 nodes:
t0: ch = EntryToken
t5: i32,ch = CopyFromReg t0, Register:i32 %1
t9: i32 = add FrameIndex:i32<-1>, Constant:i32<4>
t11: i32,ch = CopyFromReg t0, Register:i32 %2
t15: i32,ch = CopyFromReg t0, Register:i32 %3
t19: i32,ch = CopyFromReg t0, Register:i32 %4
t73: i32 = add FrameIndex:i32<-1>, Constant:i32<12>
t68: ch = store<(store (s32) into unknown-address + 12)> t0, t19, t73, undef:i32
t77: i32 = add FrameIndex:i32<-1>, Constant:i32<8>
t74: ch = store<(store (s32) into unknown-address + 8)> t0, t15, t77, undef:i32
t78: ch = store<(store (s32) into %fixed-stack.0 + 4)> t0, t11, t9, undef:i32
t81: ch = store<(store (s32) into %fixed-stack.0)> t0, t5, FrameIndex:i32<-1>, undef:i32
t83: ch = TokenFactor t19:1, t68, t15:1, t74, t11:1, t78, t5:1, t81
t29: ch,glue = callseq_start t83, TargetConstant:i32<0>, TargetConstant:i32<0>
t33: ch,glue = CopyToReg t29, Register:i32 $r0, Constant:i32<16>
t35: ch,glue = CopyToReg t33, Register:i32 $r1, Constant:i32<0>, t33:1
t38: ch,glue = ARMISD::CALL t35, TargetGlobalAddress:i32<i8* (i64)* @malloc> 0, Register:i32 $r0, Register:i32 $r1, RegisterMask:Untyped, t35:1
t40: ch,glue = callseq_end t38, TargetConstant:i32<0>, TargetConstant:i32<-1>, t38:1
t41: i32,ch,glue = CopyFromReg t40, Register:i32 $r0, t40:1
t43: ch = CopyToReg t41:1, Register:i32 %5, t41
t45: i32 = add nuw t41, Constant:i32<8>
t85: ch = store<(store (s8) into %ir.tmp), trunc to i8> t43, Constant:i32<1>, t45, undef:i32
t137: ch = store<(store (s32) into %ir.a12)> t43, t135, FrameIndex:i32<3>, undef:i32
t128: ch = store<(store (s32) into %ir.args, align 8)> t43, t135:1, FrameIndex:i32<0>, undef:i32
t140: ch = store<(store (s32) into %ir.a11)> t43, t122, FrameIndex:i32<2>, undef:i32
t144: ch = store<(store (s32) into %ir.a10)> t43, t109, FrameIndex:i32<1>, undef:i32
t147: ch = TokenFactor t137, t135:2, t128, t140, t122:2, t144, t109:1
t64: ch,glue = CopyToReg t147, Register:i32 $r0, Constant:i32<1065353216>
t66: ch,glue = CopyToReg t64, Register:i32 $r8, Register:i32 %5, t64:1
t109: i32,ch = load<(load (s32))> t85, FrameIndex:i32<-1>, undef:i32
t122: i32,i32,ch = load<(load (s32) from %fixed-stack.0 + 8), <post-inc>> t85, t9, Constant:i32<4>
t135: i32,i32,ch = load<(load (s32)), <post-inc>> t85, t122:1, Constant:i32<4>
t67: ch = ARMISD::RET_FLAG t66, Register:i32 $r0, Register:i32 $r8, t66:1
```
And without:
```
SelectionDAG has 58 nodes:
t0: ch = EntryToken
t5: i32,ch = CopyFromReg t0, Register:i32 %1
t9: i32 = add FrameIndex:i32<-1>, Constant:i32<4>
t11: i32,ch = CopyFromReg t0, Register:i32 %2
t15: i32,ch = CopyFromReg t0, Register:i32 %3
t19: i32,ch = CopyFromReg t0, Register:i32 %4
t77: i32 = add FrameIndex:i32<-1>, Constant:i32<8>
t73: i32 = add FrameIndex:i32<-1>, Constant:i32<12>
t68: ch = store<(store (s32) into unknown-address + 12)> t0, t19, t73, undef:i32
t74: ch = store<(store (s32) into unknown-address + 8)> t0, t15, t77, undef:i32
t78: ch = store<(store (s32) into %fixed-stack.0 + 4)> t0, t11, t9, undef:i32
t81: ch = store<(store (s32) into %fixed-stack.0)> t0, t5, FrameIndex:i32<-1>, undef:i32
t83: ch = TokenFactor t19:1, t68, t15:1, t74, t11:1, t78, t5:1, t81
t29: ch,glue = callseq_start t83, TargetConstant:i32<0>, TargetConstant:i32<0>
t33: ch,glue = CopyToReg t29, Register:i32 $r0, Constant:i32<16>
t35: ch,glue = CopyToReg t33, Register:i32 $r1, Constant:i32<0>, t33:1
t38: ch,glue = ARMISD::CALL t35, TargetGlobalAddress:i32<i8* (i64)* @malloc> 0, Register:i32 $r0, Register:i32 $r1, RegisterMask:Untyped, t35:1
t40: ch,glue = callseq_end t38, TargetConstant:i32<0>, TargetConstant:i32<-1>, t38:1
t41: i32,ch,glue = CopyFromReg t40, Register:i32 $r0, t40:1
t43: ch = CopyToReg t41:1, Register:i32 %5, t41
t45: i32 = add nuw t41, Constant:i32<8>
t85: ch = store<(store (s8) into %ir.tmp), trunc to i8> t43, Constant:i32<1>, t45, undef:i32
t135: ch = store<(store (s32) into %ir.a12)> t43, t132, FrameIndex:i32<3>, undef:i32
t127: ch = store<(store (s32) into %ir.args, align 8)> t43, t73, FrameIndex:i32<0>, undef:i32
t138: ch = store<(store (s32) into %ir.a11)> t43, t120, FrameIndex:i32<2>, undef:i32
t142: ch = store<(store (s32) into %ir.a10)> t43, t109, FrameIndex:i32<1>, undef:i32
t145: ch = TokenFactor t135, t132:1, t127, t138, t120:1, t142, t109:1
t64: ch,glue = CopyToReg t145, Register:i32 $r0, Constant:i32<1065353216>
t66: ch,glue = CopyToReg t64, Register:i32 $r8, Register:i32 %5, t64:1
t109: i32,ch = load<(load (s32))> t85, FrameIndex:i32<-1>, undef:i32
t120: i32,ch = load<(load (s32))> t85, t9, undef:i32
t132: i32,ch = load<(load (s32))> t85, t77, undef:i32
t67: ch = ARMISD::RET_FLAG t66, Register:i32 $r0, Register:i32 $r8, t66:1
```
The meaningful change seems to be these crazy loads:
```
t122: i32,i32,ch = load<(load (s32) from %fixed-stack.0 + 8), <post-inc>> t85, t9, Constant:i32<4>
t135: i32,i32,ch = load<(load (s32)), <post-inc>> t85, t122:1, Constant:i32<4>
```
It'd be really helpful to have some ARM expert here to help us knowing what the right path forward is.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124743/new/
https://reviews.llvm.org/D124743
More information about the llvm-commits
mailing list