[llvm] r335183 - Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.

Alina Sbirlea via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 21 16:20:22 PDT 2018


Hi Justin,

Yes, I believe it is expected in some cases, and I did go over this change
offline with some folks because I had the same doubt.
The issue is that this may change the order of blocks, which can affect
codegen. Hence it's not entirely NFC.
For the above example, if you look at the IR change rather than the asm
change, it's just two blocks getting reversed in the list.

The test as seen by "MergeBlocksIntoPred" is:
entry:
  br i1 undef, label %for.cond5.preheader, label %for.end52

for.cond5.preheader:
  br label %for.cond5

for.cond5:
  %or.cond = and i1 undef, false
  br i1 %or.cond, label %for.body33.preheader, label %for.cond5

for.body33.preheader:
  br label %for.body33

for.cond30:
  br label %for.cond5

for.body33:
  %tobool = fcmp une double undef, 0.000000e+00
  br i1 %tobool, label %if.then, label %for.cond30

if.then:
  call void @scale()
  br label %for.cond30

for.end52:
  ret void


Now, calling MergeBlocksIntoPred, will attempt to merge
for.body33.preheader and for.body33, which happen to have for.cond30
inbetween.
for.body33.preheader:
  br label %for.body33

for.cond30:
  br label %for.cond5

for.body33:
  %tobool = fcmp une double undef, 0.000000e+00
  br i1 %tobool, label %if.then, label %for.cond30

Before this patch, the two would be merged "down", so you delete the
preheader and get:
for.cond5:
  %or.cond = and i1 undef, false
  br i1 %or.cond, label %for.body33, label %for.cond5

for.cond30:
  br label %for.cond5

for.body33:
  %tobool = fcmp une double undef, 0.000000e+00
  br i1 %tobool, label %if.then, label %for.cond30

After the patch, they're merged "up", i.e. "into the predecessor", so you
get:
for.cond5:
  %or.cond = and i1 undef, false
  br i1 %or.cond, label %for.body33.preheader, label %for.cond5

for.body33.preheader:
  %tobool = fcmp une double undef, 0.000000e+00
  br i1 %tobool, label %if.then, label %for.cond30

for.cond30:
  br label %for.cond5

So the CFG is identical, with a different label name. But the BBs will
appear in a different order, so the layout changes. This can factor into
codegen both good and bad.

For "test/CodeGen/X86/hoist-spill.ll", this appeared to factor in a good
way. The test checked:
2 x CHECK: mov{{.}} %{{.*}}, [[SPOFFSET1:-?[0-9]*]](%rsp)
2 x CHECK-NOT: mov{{.}} %{{.*}}, [[SPOFFSET2]](%rsp)
After the patch, there was no second movl with negative offset or any
reloads. This should be good. The condition the test checks: "no spills to
the same stack slot after hoisting." is true, but the code generated
changed. Again, because there are blocks in between the merged blocks so
the order changed.
Now, to satisfy what the test is supposed to check for, I moved below the
empty block about to be merged, so two "movl minus offset" are still
generated rather than just one and we can still check for two "movl offset"
not existing.

Please let me know if this makes sense or I missed something.

Thanks,
Alina


On Thu, Jun 21, 2018 at 3:18 PM Justin Bogner <mail at justinbogner.com> wrote:

> Alina Sbirlea via llvm-commits <llvm-commits at lists.llvm.org> writes:
> > Author: asbirlea
> > Date: Wed Jun 20 15:01:04 2018
> > New Revision: 335183
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=335183&view=rev
> > Log:
> > Generalize MergeBlockIntoPredecessor. Replace uses of
> MergeBasicBlockIntoOnlyPred.
> >
> > Summary:
> > Two utils methods have essentially the same functionality. This is an
> attempt to merge them into one.
> > 1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred
> > 2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor
> >
> > Prior to the patch:
> > 1. MergeBasicBlockIntoOnlyPred
> > Updates either DomTree or DeferredDominance
> > Moves all instructions from Pred to BB, deletes Pred
> > Asserts BB has single predecessor
> > If address was taken, replace the block address with constant 1 (?)
> >
> > 2. MergeBlockIntoPredecessor
> > Updates DomTree, LoopInfo and MemoryDependenceResults
> > Moves all instruction from BB to Pred, deletes BB
> > Returns if doesn't have a single predecessor
> > Returns if BB's address was taken
> >
> > After the patch:
> > Method 2. MergeBlockIntoPredecessor is attempting to become the new
> default:
> > Updates DomTree or DeferredDominance, and LoopInfo and
> MemoryDependenceResults
> > Moves all instruction from BB to Pred, deletes BB
> > Returns if doesn't have a single predecessor
> > Returns if BB's address was taken
> >
> > Uses of MergeBasicBlockIntoOnlyPred that need to be replaced:
> >
> > 1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp
> > Updated in this patch. No challenges.
> >
> > 2. lib/CodeGen/CodeGenPrepare.cpp
> > Updated in this patch.
> >   i. eliminateFallThrough is straightforward, but I added using a
> temporary array to avoid the iterator invalidation.
> >   ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary
> array for blocks
> > Some interesting aspects:
> >   - Since Pred is not deleted (BB is), the entry block does not need
> updating.
> >   - The entry block was being updated with the deleted block in
> eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred.
> >   - isMergingEmptyBlockProfitable assumes BB is the one to be deleted.
> >   - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it
> deletes its unique predecessor instead.
> >   - adding some test owner as subscribers for the interesting tests
> modified:
> >     test/CodeGen/X86/avx-cmp.ll
> >     test/CodeGen/AMDGPU/nested-loop-conditions.ll
> >     test/CodeGen/AMDGPU/si-annotate-cf.ll
> >     test/CodeGen/X86/hoist-spill.ll
> >     test/CodeGen/X86/2006-11-17-IllegalMove.ll
> >
> > 3. lib/Transforms/Scalar/JumpThreading.cpp
> > Not covered in this patch. It is the only use case using the
> DeferredDominance.
> > I would defer to Brian Rzycki to make this replacement.
> >
> > Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar
> >
> > Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr,
> kbarton, RKSimon, wmi, arsenm, llvm-commits
> >
> > Differential Revision: https://reviews.llvm.org/D48202
> >
> > Modified:
> >     llvm/trunk/include/llvm/Transforms/Utils/BasicBlockUtils.h
> >     llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
> >     llvm/trunk/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
> >     llvm/trunk/lib/Transforms/Utils/BasicBlockUtils.cpp
> >     llvm/trunk/test/CodeGen/AMDGPU/branch-relaxation.ll
> >     llvm/trunk/test/CodeGen/AMDGPU/nested-loop-conditions.ll
> >     llvm/trunk/test/CodeGen/AMDGPU/si-annotate-cf.ll
> >     llvm/trunk/test/CodeGen/ARM/indirectbr.ll
> >     llvm/trunk/test/CodeGen/PowerPC/memCmpUsedInZeroEqualityComparison.ll
> >     llvm/trunk/test/CodeGen/PowerPC/memcmp-mergeexpand.ll
> >     llvm/trunk/test/CodeGen/PowerPC/ppc-shrink-wrapping.ll
> >     llvm/trunk/test/CodeGen/PowerPC/simplifyConstCmpToISEL.ll
> >     llvm/trunk/test/CodeGen/Thumb2/2010-11-22-EpilogueBug.ll
> >     llvm/trunk/test/CodeGen/Thumb2/thumb2-jtb.ll
> >     llvm/trunk/test/CodeGen/X86/2006-11-17-IllegalMove.ll
> >     llvm/trunk/test/CodeGen/X86/avx-cmp.ll
> >     llvm/trunk/test/CodeGen/X86/avx-splat.ll
> >     llvm/trunk/test/CodeGen/X86/avx2-vbroadcast.ll
> >     llvm/trunk/test/CodeGen/X86/avx512-i1test.ll
> >     llvm/trunk/test/CodeGen/X86/block-placement.ll
> >     llvm/trunk/test/CodeGen/X86/hoist-spill.ll
> >     llvm/trunk/test/CodeGen/X86/ins_subreg_coalesce-1.ll
> >     llvm/trunk/test/CodeGen/X86/memcmp-mergeexpand.ll
> >     llvm/trunk/test/CodeGen/X86/pr32108.ll
> >     llvm/trunk/test/CodeGen/X86/setcc-lowering.ll
> >     llvm/trunk/test/CodeGen/X86/split-store.ll
> >     llvm/trunk/test/CodeGen/X86/tail-dup-merge-loop-headers.ll
> >     llvm/trunk/test/DebugInfo/Generic/sunk-compare.ll
> >     llvm/trunk/test/Transforms/CodeGenPrepare/X86/computedgoto.ll
> >     llvm/trunk/test/Transforms/CodeGenPrepare/basic.ll
> >     llvm/trunk/test/Transforms/LoopSimplifyCFG/scev.ll
> >
>  llvm/trunk/test/Transforms/LoopStrengthReduce/ARM/2012-06-15-lsr-noaddrmode.ll
> >
>  llvm/trunk/test/Transforms/LoopStrengthReduce/X86/2012-01-13-phielim.ll
> >
>  llvm/trunk/test/Transforms/SimpleLoopUnswitch/trivial-unswitch-iteration.ll
>
> > Modified: llvm/trunk/test/CodeGen/X86/avx-cmp.ll
> > URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-cmp.ll?rev=335183&r1=335182&r2=335183&view=diff
> >
> ==============================================================================
> > --- llvm/trunk/test/CodeGen/X86/avx-cmp.ll (original)
> > +++ llvm/trunk/test/CodeGen/X86/avx-cmp.ll Wed Jun 20 15:01:04 2018
> > @@ -26,12 +26,15 @@ declare void @scale() nounwind
> >  define void @render() nounwind {
> >  ; CHECK-LABEL: render:
> >  ; CHECK:       # %bb.0: # %entry
> > +; CHECK-NEXT:    pushq %rbp
> >  ; CHECK-NEXT:    pushq %rbx
> > +; CHECK-NEXT:    pushq %rax
> >  ; CHECK-NEXT:    xorl %eax, %eax
> >  ; CHECK-NEXT:    testb %al, %al
> >  ; CHECK-NEXT:    jne .LBB2_6
> >  ; CHECK-NEXT:  # %bb.1: # %for.cond5.preheader
> >  ; CHECK-NEXT:    xorl %ebx, %ebx
> > +; CHECK-NEXT:    movb $1, %bpl
> >  ; CHECK-NEXT:    jmp .LBB2_2
> >  ; CHECK-NEXT:    .p2align 4, 0x90
> >  ; CHECK-NEXT:  .LBB2_5: # %if.then
> > @@ -43,8 +46,8 @@ define void @render() nounwind {
> >  ; CHECK-NEXT:    jne .LBB2_2
> >  ; CHECK-NEXT:  # %bb.3: # %for.cond5
> >  ; CHECK-NEXT:    # in Loop: Header=BB2_2 Depth=1
> > -; CHECK-NEXT:    testb %bl, %bl
> > -; CHECK-NEXT:    je .LBB2_2
> > +; CHECK-NEXT:    testb %bpl, %bpl
> > +; CHECK-NEXT:    jne .LBB2_2
> >  ; CHECK-NEXT:  # %bb.4: # %for.body33
> >  ; CHECK-NEXT:    # in Loop: Header=BB2_2 Depth=1
> >  ; CHECK-NEXT:    vucomisd {{\.LCPI.*}}, %xmm0
> > @@ -52,7 +55,9 @@ define void @render() nounwind {
> >  ; CHECK-NEXT:    jp .LBB2_5
> >  ; CHECK-NEXT:    jmp .LBB2_2
> >  ; CHECK-NEXT:  .LBB2_6: # %for.end52
> > +; CHECK-NEXT:    addq $8, %rsp
> >  ; CHECK-NEXT:    popq %rbx
> > +; CHECK-NEXT:    popq %rbp
> >  ; CHECK-NEXT:    retq
> >  entry:
> >    br i1 undef, label %for.cond5, label %for.end52
>
> This test change looks a bit suspicious - I would've expected your patch
> to be basically NFC, but here we end up using a few more registers. Is
> this expected?
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180621/8cd00aab/attachment.html>


More information about the llvm-commits mailing list