[PATCH] D124743: [DAGCombine] Add node in the worklist in topological order in CombineTo

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 9 01:58:54 PDT 2022


fhahn added inline comments.


================
Comment at: llvm/test/CodeGen/AArch64/swifterror.ll:946
+; CHECK-APPLE-AARCH64-NEXT:    str w8, [sp, #16]
+; CHECK-APPLE-AARCH64-NEXT:    ldr w8, [x9], #8
 ; CHECK-APPLE-AARCH64-NEXT:    fmov s0, #1.00000000
----------------
deadalnix wrote:
> Pre DAG:
> ```
> SelectionDAG has 41 nodes:
>       t0: ch = EntryToken
>     t6: ch,glue = callseq_start t0, TargetConstant:i64<0>, TargetConstant:i64<0>
>   t10: ch,glue = CopyToReg t6, Register:i64 $x0, Constant:i64<16>
>   t13: ch,glue = AArch64ISD::CALL t10, TargetGlobalAddress:i64<i8* (i64)* @malloc> 0, Register:i64 $x0, RegisterMask:Untyped, t10:1
>   t14: ch,glue = callseq_end t13, TargetConstant:i64<0>, TargetConstant:i64<0>, t13:1
>   t15: i64,ch,glue = CopyFromReg t14, Register:i64 $x0, t14:1
>   t17: ch = CopyToReg t15:1, Register:i64 %1, t15
>     t19: i64 = add nuw t15, Constant:i64<8>
>   t45: ch = store<(store (s8) into %ir.tmp), trunc to i8> t17, Constant:i32<1>, t19, undef:i64
>       t100: ch = store<(store (s32) into %ir.a12)> t17, t97, FrameIndex:i64<3>, undef:i64
>         t95: i64 = add FrameIndex:i64<-1>, Constant:i64<24>
>       t90: ch = store<(store (s64) into %ir.args)> t17, t95, FrameIndex:i64<0>, undef:i64
>       t103: ch = store<(store (s32) into %ir.a11)> t17, t83, FrameIndex:i64<2>, undef:i64
>       t107: ch = store<(store (s32) into %ir.a10)> t17, t71, FrameIndex:i64<1>, undef:i64
>     t110: ch = TokenFactor t100, t97:1, t90, t103, t83:1, t107, t71:1
>   t40: ch,glue = CopyToReg t110, Register:f32 $s0, ConstantFP:f32<1.000000e+00>
>   t42: ch,glue = CopyToReg t40, Register:i64 $x21, Register:i64 %1, t40:1
>   t71: i32,ch = load<(load (s32) from %fixed-stack.0, align 16)> t45, FrameIndex:i64<-1>, undef:i64
>     t69: i64 = or FrameIndex:i64<-1>, Constant:i64<8>
>   t83: i32,ch = load<(load (s32), align 8)> t45, t69, undef:i64
>     t81: i64 = add FrameIndex:i64<-1>, Constant:i64<16>
>   t97: i32,ch = load<(load (s32) from %fixed-stack.0 + 16, align 16)> t45, t81, undef:i64
>   t43: ch = AArch64ISD::RET_FLAG t42, Register:f32 $s0, Register:i64 $x21, t42:1
> ```
> 
> Post DAG:
> ```
> SelectionDAG has 38 nodes:
>       t0: ch = EntryToken
>     t6: ch,glue = callseq_start t0, TargetConstant:i64<0>, TargetConstant:i64<0>
>   t10: ch,glue = CopyToReg t6, Register:i64 $x0, Constant:i64<16>
>   t13: ch,glue = AArch64ISD::CALL t10, TargetGlobalAddress:i64<i8* (i64)* @malloc> 0, Register:i64 $x0, RegisterMask:Untyped, t10:1
>   t14: ch,glue = callseq_end t13, TargetConstant:i64<0>, TargetConstant:i64<0>, t13:1
>   t15: i64,ch,glue = CopyFromReg t14, Register:i64 $x0, t14:1
>   t17: ch = CopyToReg t15:1, Register:i64 %1, t15
>     t19: i64 = add nuw t15, Constant:i64<8>
>   t45: ch = store<(store (s8) into %ir.tmp), trunc to i8> t17, Constant:i32<1>, t19, undef:i64
>       t99: ch = store<(store (s32) into %ir.a12)> t17, t97, FrameIndex:i64<3>, undef:i64
>       t90: ch = store<(store (s64) into %ir.args)> t17, t97:1, FrameIndex:i64<0>, undef:i64
>       t102: ch = store<(store (s32) into %ir.a11)> t17, t84, FrameIndex:i64<2>, undef:i64
>       t106: ch = store<(store (s32) into %ir.a10)> t17, t70, FrameIndex:i64<1>, undef:i64
>     t109: ch = TokenFactor t99, t97:2, t90, t102, t84:2, t106, t70:1
>   t40: ch,glue = CopyToReg t109, Register:f32 $s0, ConstantFP:f32<1.000000e+00>
>   t42: ch,glue = CopyToReg t40, Register:i64 $x21, Register:i64 %1, t40:1
>   t70: i32,ch = load<(load (s32) from %fixed-stack.0, align 16)> t45, FrameIndex:i64<-1>, undef:i64
>     t73: i64 = or FrameIndex:i64<-1>, Constant:i64<8>
>   t84: i32,i64,ch = load<(load (s32), align 8), <post-inc>> t45, t73, Constant:i64<8>
>   t97: i32,i64,ch = load<(load (s32)), <post-inc>> t45, t84:1, Constant:i64<8>
>   t43: ch = AArch64ISD::RET_FLAG t42, Register:f32 $s0, Register:i64 $x21, t42:1
> ```
> 
> Once again, this looks like it is improving things at the DAGCombine level, but trips up later stages in the backend.
It would be good to investigate what's blocking the load/store optimizer here.
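
For reference while investigating, here is a minimal C++ sketch of the address arithmetic in the two quoted dumps. It is only a model: `SaveArea`, `makeSaveArea`, `preCombine` and `postCombine` are hypothetical names, and the 32-byte, 16-byte-aligned buffer stands in for `%fixed-stack.0`. It shows that both shapes load the same values and store the same pointer to `%ir.args`. In the post-combine form each address comes from the previous load's write-back result (`t84:1`, `t97:1`) rather than from a fixed offset off the base. Whether that chaining is what blocks the load/store optimizer is an assumption to check, not a finding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for %fixed-stack.0 (align 16 in the dumps).
struct SaveArea {
    alignas(16) unsigned char bytes[32];
};

struct Result {
    std::int32_t a10, a11, a12;  // values stored to %ir.a10 / %ir.a11 / %ir.a12
    const unsigned char *args;   // pointer stored to %ir.args
};

static std::int32_t load32(const unsigned char *p) {
    std::int32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Models `or FrameIndex<-1>, 8`; equal to +8 only because the base is 16-byte aligned.
static const unsigned char *orOffset(const unsigned char *p, std::uintptr_t bits) {
    return reinterpret_cast<const unsigned char *>(
        reinterpret_cast<std::uintptr_t>(p) | bits);
}

// Models a post-inc load: yields the loaded value and the incremented address.
struct PostInc {
    std::int32_t value;
    const unsigned char *next;
};
static PostInc loadPostInc(const unsigned char *p, std::ptrdiff_t inc) {
    return {load32(p), p + inc};
}

// Helper for the sketch: place three i32 values at offsets 0, 8 and 16.
SaveArea makeSaveArea(std::int32_t v0, std::int32_t v1, std::int32_t v2) {
    SaveArea s{};
    std::memcpy(s.bytes + 0, &v0, sizeof v0);
    std::memcpy(s.bytes + 8, &v1, sizeof v1);
    std::memcpy(s.bytes + 16, &v2, sizeof v2);
    return s;
}

// Pre-combine shape: t71, t83, t97 and t95 each use an independent offset from the base.
Result preCombine(const SaveArea &s) {
    const unsigned char *base = s.bytes;
    assert(reinterpret_cast<std::uintptr_t>(base) % 16 == 0);
    return {load32(base), load32(orOffset(base, 8)), load32(base + 16), base + 24};
}

// Post-combine shape: t84 and t97 are post-inc loads; t97 uses t84:1, %ir.args gets t97:1.
Result postCombine(const SaveArea &s) {
    const unsigned char *base = s.bytes;
    assert(reinterpret_cast<std::uintptr_t>(base) % 16 == 0);
    std::int32_t t70 = load32(base);
    PostInc t84 = loadPostInc(orOffset(base, 8), 8);
    PostInc t97 = loadPostInc(t84.next, 8);
    return {t70, t84.value, t97.value, t97.next};
}
```

Calling `preCombine` and `postCombine` on the same `SaveArea` should return identical `Result` fields, so the combine itself preserves the computed addresses. The open question is the serial dependency it introduces between the loads.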


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124743/new/

https://reviews.llvm.org/D124743


