[PATCH] D127115: [RFC][DAGCombine] Make sure combined nodes are added back to the worklist in topological order.
    Simon Pilgrim via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Sat Jul 16 05:19:19 PDT 2022
    
    
  
RKSimon added inline comments.
================
Comment at: llvm/test/CodeGen/PowerPC/testComparesigeuc.ll:72
+; CHECK-NEXT:    not r3, r3
+; CHECK-NEXT:    clrlwi r3, r3, 31
 ; CHECK-NEXT:    stb r3, glob at toc@l(r5)
----------------
RKSimon wrote:
> RKSimon wrote:
> > RKSimon wrote:
> > > deadalnix wrote:
> > > > Both start from:
> > > > ```
> > > > SelectionDAG has 28 nodes:
> > > >   t0: ch = EntryToken
> > > >       t2: i64,ch = CopyFromReg t0, Register:i64 %0
> > > >     t28: i64 = AssertZext t2, ValueType:ch:i8
> > > >   t31: i32 = truncate t28
> > > >       t7: i64,ch = CopyFromReg t0, Register:i64 %1
> > > >     t25: i64 = AssertZext t7, ValueType:ch:i8
> > > >   t32: i32 = truncate t25
> > > >   t16: i1 = setcc t31, t32, setuge:ch
> > > >                 t46: i64 = zero_extend t31, Constant:i32<64>
> > > >                 t47: i64 = zero_extend t32, Constant:i32<64>
> > > >               t48: i64 = sub t46, t47
> > > >             t50: i64 = srl t48, Constant:i32<63>
> > > >           t52: i64 = xor t50, Constant:i64<1>
> > > >         t53: i1 = truncate t52
> > > >       t36: i32 = zero_extend t53
> > > >       t42: i64,ch = PPCISD::TOC_ENTRY<(load (s64) from got)> TargetGlobalAddress:i64<i8* @glob> 0,$
> > > >     t37: ch = store<(store (s8) into @glob), trunc to i8> t0, t36, t42, undef:i64
> > > >   t22: ch = PPCISD::RET_FLAG t37
> > > > ```
> > > > 
> > > > With this patch, the first significant node to be combined is `t53: i1 = truncate t52`, which goes through SimplifytDemandedBits and the DAG comes out as:
> > > > ```
> > > > SelectionDAG has 28 nodes:
> > > >   t0: ch = EntryToken
> > > >             t2: i64,ch = CopyFromReg t0, Register:i64 %0
> > > >           t28: i64 = AssertZext t2, ValueType:ch:i8
> > > >         t31: i32 = truncate t28
> > > >       t46: i64 = zero_extend t31, Constant:i32<64>
> > > >             t7: i64,ch = CopyFromReg t0, Register:i64 %1
> > > >           t25: i64 = AssertZext t7, ValueType:ch:i8
> > > >         t32: i32 = truncate t25
> > > >       t47: i64 = zero_extend t32, Constant:i32<64>
> > > >     t48: i64 = sub t46, t47
> > > >   t50: i64 = srl t48, Constant:i32<63>
> > > >   t52: i64 = xor t50, Constant:i64<1>
> > > >           t55: i64 = xor t50, Constant:i64<-1>
> > > >         t53: i1 = truncate t55
> > > >       t36: i32 = zero_extend t53
> > > >       t42: i64,ch = PPCISD::TOC_ENTRY<(load (s64) from got)> TargetGlobalAddress:i64<i8* @glob> 0,$
> > > >     t37: ch = store<(store (s8) into @glob), trunc to i8> t0, t36, t42, undef:i64
> > > >   t22: ch = PPCISD::RET_FLAG t37
> > > > ```
> > > > 
> > > > Previously, the first significant node to be combined was `t52: i64 = xor t50, Constant:i64<1>`, creating this DAG:
> > > > ```
> > > > SelectionDAG has 28 nodes:
> > > >   t0: ch = EntryToken
> > > >           t2: i64,ch = CopyFromReg t0, Register:i64 %0
> > > >         t28: i64 = AssertZext t2, ValueType:ch:i8
> > > >       t31: i32 = truncate t28
> > > >     t46: i64 = zero_extend t31, Constant:i32<64>
> > > >           t7: i64,ch = CopyFromReg t0, Register:i64 %1
> > > >         t25: i64 = AssertZext t7, ValueType:ch:i8
> > > >       t32: i32 = truncate t25
> > > >     t47: i64 = zero_extend t32, Constant:i32<64>
> > > >   t48: i64 = sub t46, t47
> > > >     t50: i64 = srl t48, Constant:i32<63>
> > > >   t52: i64 = xor t50, Constant:i64<1>
> > > >           t56: i64 = xor t48, Constant:i64<-1>
> > > >         t57: i64 = srl t56, Constant:i32<63>
> > > >       t54: i32 = truncate t57
> > > >       t42: i64,ch = PPCISD::TOC_ENTRY<(load (s64) from got)> TargetGlobalAddress:i64<i8* @glob> 0, $
> > > >     t37: ch = store<(store (s8) into @glob), trunc to i8> t0, t54, t42, undef:i64
> > > >   t22: ch = PPCISD::RET_FLAG t37
> > > > ```
> > > > 
> > > > Which creates a ton redundancy, but that gets cleaned up.
> > > DAGCombine has a  "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold - but it requires a very specific XoR constant value - I wonder if we moved it to SimplifyDemandedBits we could generalize it slightly?
> > Yup, looks to be OK: https://alive2.llvm.org/ce/z/WSY3Kb - I'll see if I can get it to fix this regression
> Sorry - wrong assumption (still correct though): https://alive2.llvm.org/ce/z/mE3NHT
https://reviews.llvm.org/D129933 should address the PowerPC/testCompares* regressions
Repository:
  rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127115/new/
https://reviews.llvm.org/D127115
    
    
More information about the llvm-commits
mailing list