[llvm] r335183 - Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.
Justin Bogner via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 21 15:18:46 PDT 2018
Alina Sbirlea via llvm-commits <llvm-commits at lists.llvm.org> writes:
> Author: asbirlea
> Date: Wed Jun 20 15:01:04 2018
> New Revision: 335183
>
> URL: http://llvm.org/viewvc/llvm-project?rev=335183&view=rev
> Log:
> Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.
>
> Summary:
> Two utils methods have essentially the same functionality. This is an attempt to merge them into one.
> 1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred
> 2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor
>
> Prior to the patch:
> 1. MergeBasicBlockIntoOnlyPred
> Updates either DomTree or DeferredDominance
> Moves all instructions from Pred to BB, deletes Pred
> Asserts BB has single predecessor
> If address was taken, replace the block address with constant 1 (?)
>
> 2. MergeBlockIntoPredecessor
> Updates DomTree, LoopInfo and MemoryDependenceResults
> Moves all instruction from BB to Pred, deletes BB
> Returns if doesn't have a single predecessor
> Returns if BB's address was taken
>
> After the patch:
> Method 2. MergeBlockIntoPredecessor is attempting to become the new default:
> Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults
> Moves all instruction from BB to Pred, deletes BB
> Returns if doesn't have a single predecessor
> Returns if BB's address was taken
>
> Uses of MergeBasicBlockIntoOnlyPred that need to be replaced:
>
> 1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp
> Updated in this patch. No challenges.
>
> 2. lib/CodeGen/CodeGenPrepare.cpp
> Updated in this patch.
> i. eliminateFallThrough is straightforward, but I added using a temporary array to avoid the iterator invalidation.
> ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks
> Some interesting aspects:
> - Since Pred is not deleted (BB is), the entry block does not need updating.
> - The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred.
> - isMergingEmptyBlockProfitable assumes BB is the one to be deleted.
> - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead.
> - adding some test owner as subscribers for the interesting tests modified:
> test/CodeGen/X86/avx-cmp.ll
> test/CodeGen/AMDGPU/nested-loop-conditions.ll
> test/CodeGen/AMDGPU/si-annotate-cf.ll
> test/CodeGen/X86/hoist-spill.ll
> test/CodeGen/X86/2006-11-17-IllegalMove.ll
>
> 3. lib/Transforms/Scalar/JumpThreading.cpp
> Not covered in this patch. It is the only use case using the DeferredDominance.
> I would defer to Brian Rzycki to make this replacement.
>
> Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar
>
> Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D48202
>
> Modified:
> llvm/trunk/include/llvm/Transforms/Utils/BasicBlockUtils.h
> llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
> llvm/trunk/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
> llvm/trunk/lib/Transforms/Utils/BasicBlockUtils.cpp
> llvm/trunk/test/CodeGen/AMDGPU/branch-relaxation.ll
> llvm/trunk/test/CodeGen/AMDGPU/nested-loop-conditions.ll
> llvm/trunk/test/CodeGen/AMDGPU/si-annotate-cf.ll
> llvm/trunk/test/CodeGen/ARM/indirectbr.ll
> llvm/trunk/test/CodeGen/PowerPC/memCmpUsedInZeroEqualityComparison.ll
> llvm/trunk/test/CodeGen/PowerPC/memcmp-mergeexpand.ll
> llvm/trunk/test/CodeGen/PowerPC/ppc-shrink-wrapping.ll
> llvm/trunk/test/CodeGen/PowerPC/simplifyConstCmpToISEL.ll
> llvm/trunk/test/CodeGen/Thumb2/2010-11-22-EpilogueBug.ll
> llvm/trunk/test/CodeGen/Thumb2/thumb2-jtb.ll
> llvm/trunk/test/CodeGen/X86/2006-11-17-IllegalMove.ll
> llvm/trunk/test/CodeGen/X86/avx-cmp.ll
> llvm/trunk/test/CodeGen/X86/avx-splat.ll
> llvm/trunk/test/CodeGen/X86/avx2-vbroadcast.ll
> llvm/trunk/test/CodeGen/X86/avx512-i1test.ll
> llvm/trunk/test/CodeGen/X86/block-placement.ll
> llvm/trunk/test/CodeGen/X86/hoist-spill.ll
> llvm/trunk/test/CodeGen/X86/ins_subreg_coalesce-1.ll
> llvm/trunk/test/CodeGen/X86/memcmp-mergeexpand.ll
> llvm/trunk/test/CodeGen/X86/pr32108.ll
> llvm/trunk/test/CodeGen/X86/setcc-lowering.ll
> llvm/trunk/test/CodeGen/X86/split-store.ll
> llvm/trunk/test/CodeGen/X86/tail-dup-merge-loop-headers.ll
> llvm/trunk/test/DebugInfo/Generic/sunk-compare.ll
> llvm/trunk/test/Transforms/CodeGenPrepare/X86/computedgoto.ll
> llvm/trunk/test/Transforms/CodeGenPrepare/basic.ll
> llvm/trunk/test/Transforms/LoopSimplifyCFG/scev.ll
> llvm/trunk/test/Transforms/LoopStrengthReduce/ARM/2012-06-15-lsr-noaddrmode.ll
> llvm/trunk/test/Transforms/LoopStrengthReduce/X86/2012-01-13-phielim.ll
> llvm/trunk/test/Transforms/SimpleLoopUnswitch/trivial-unswitch-iteration.ll
> Modified: llvm/trunk/test/CodeGen/X86/avx-cmp.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-cmp.ll?rev=335183&r1=335182&r2=335183&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/avx-cmp.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/avx-cmp.ll Wed Jun 20 15:01:04 2018
> @@ -26,12 +26,15 @@ declare void @scale() nounwind
> define void @render() nounwind {
> ; CHECK-LABEL: render:
> ; CHECK: # %bb.0: # %entry
> +; CHECK-NEXT: pushq %rbp
> ; CHECK-NEXT: pushq %rbx
> +; CHECK-NEXT: pushq %rax
> ; CHECK-NEXT: xorl %eax, %eax
> ; CHECK-NEXT: testb %al, %al
> ; CHECK-NEXT: jne .LBB2_6
> ; CHECK-NEXT: # %bb.1: # %for.cond5.preheader
> ; CHECK-NEXT: xorl %ebx, %ebx
> +; CHECK-NEXT: movb $1, %bpl
> ; CHECK-NEXT: jmp .LBB2_2
> ; CHECK-NEXT: .p2align 4, 0x90
> ; CHECK-NEXT: .LBB2_5: # %if.then
> @@ -43,8 +46,8 @@ define void @render() nounwind {
> ; CHECK-NEXT: jne .LBB2_2
> ; CHECK-NEXT: # %bb.3: # %for.cond5
> ; CHECK-NEXT: # in Loop: Header=BB2_2 Depth=1
> -; CHECK-NEXT: testb %bl, %bl
> -; CHECK-NEXT: je .LBB2_2
> +; CHECK-NEXT: testb %bpl, %bpl
> +; CHECK-NEXT: jne .LBB2_2
> ; CHECK-NEXT: # %bb.4: # %for.body33
> ; CHECK-NEXT: # in Loop: Header=BB2_2 Depth=1
> ; CHECK-NEXT: vucomisd {{\.LCPI.*}}, %xmm0
> @@ -52,7 +55,9 @@ define void @render() nounwind {
> ; CHECK-NEXT: jp .LBB2_5
> ; CHECK-NEXT: jmp .LBB2_2
> ; CHECK-NEXT: .LBB2_6: # %for.end52
> +; CHECK-NEXT: addq $8, %rsp
> ; CHECK-NEXT: popq %rbx
> +; CHECK-NEXT: popq %rbp
> ; CHECK-NEXT: retq
> entry:
> br i1 undef, label %for.cond5, label %for.end52
This test change looks a bit suspicious - I would've expected your patch
to be basically NFC, but here we end up using a few more registers. Is
this expected?
More information about the llvm-commits
mailing list