[llvm] 5bc291b - [SelectionDAG] fix predecessor list for INLINEASM_BRs' parent

Nick Desaulniers via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 6 13:46:59 PDT 2020


Author: Nick Desaulniers
Date: 2020-04-06T13:46:39-07:00
New Revision: 5bc291be71548a1e876f9813778099e09543cb82

URL: https://github.com/llvm/llvm-project/commit/5bc291be71548a1e876f9813778099e09543cb82
DIFF: https://github.com/llvm/llvm-project/commit/5bc291be71548a1e876f9813778099e09543cb82.diff

LOG: [SelectionDAG] fix predecessor list for INLINEASM_BRs' parent

Summary:
A bug report mentioned that LLVM was producing jumps off the end of a
function when using "asm goto with outputs". Further digging pointed to
MachineBasicBlocks that had their address taken and were indirect
targets of INLINEASM_BR being removed by BranchFolder, because their
 predecessor list was empty, so they appeared to have no entry.

This was a cascading failure caused earlier, during Pre-RA instruction
scheduling. We have a few special cases in Pre-RA instruction scheduling
where we split a MachineBasicBlock in two.  This requires careful
handing of predecessor and successor lists for a MachineBasicBlock that
was split, and careful handing of PHI MachineInstrs that referred to the
MachineBasicBlock before it was split.

The clue that led to this fix was the observation that many callers of
MachineBasicBlock::splice() frequently call
MachineBasicBlock::transferSuccessorsAndUpdatePHIs() to update their PHI
nodes after a splice. We don't want to reuse that method, as we have
custom successor transferring logic for this block split.

This patch fixes 2 pre-existing bugs, and adds tests.

The first bug was that MachineBasicBlock::splice() correctly handles
updating most successors and predecessors; we don't need to do anything
more than removing the previous fallthrough block from the first half of
the split block post splice. Previously, we were updating the successor
list incorrectly (updating successors updates predecessors).

The second bug was that PHI nodes that needed registers from the first
half of the split block were not having entries populated.  The register
live out information was correct, and the FuncInfo->PHINodesToUpdate was
correct. Specifically, the check in SelectionDAGISel::FinishBasicBlock:

    for (unsigned i = 0, e = FuncInfo->PHINodesToUpdate.size(); i != e; ++i) {
      MachineInstrBuilder PHI(*MF, FuncInfo->PHINodesToUpdate[i].first);
      if (!FuncInfo->MBB->isSuccessor(PHI->getParent()))
        continue;
      PHI.addReg(FuncInfo->PHINodesToUpdate[i].second).addMBB(FuncInfo->MBB);

was `continue`ing because FuncInfo->MBB tracks the second half of
the post-split block; no one was updating PHI entries for the first half
of the post-split block.

SelectionDAGBuilder::UpdateSplitBlock() already expects to perform
special handling for MachineBasicBlocks that were split post calls to
ScheduleDAGSDNodes::EmitSchedule(), so I'm confident that it's both
correct for ScheduleDAGSDNodes::EmitSchedule() to return the second half
of the split block `CopyBB` which updates `FuncInfo->MBB` (ie. the
current MachineBasicBlock being processed), and perform special handling
for this in SelectionDAGBuilder::UpdateSplitBlock().

Reviewers: void, craig.topper, efriedma

Reviewed By: void, efriedma

Subscribers: hfinkel, fhahn, MatzeB, efriedma, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76961

Added: 
    llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll

Modified: 
    llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
    llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
    llvm/test/CodeGen/X86/callbr-asm-outputs.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
index 9a2f8640a167..95e8d9e53abd 100644
--- a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
@@ -1054,6 +1054,7 @@ EmitSchedule(MachineBasicBlock::iterator &InsertPos) {
     CopyBB->setInlineAsmBrDefaultTarget();
 
     CopyBB->addSuccessor(FallThrough, BranchProbability::getOne());
+    BB->removeSuccessor(FallThrough);
     BB->addSuccessor(CopyBB, BranchProbability::getOne());
 
     // Mark all physical registers defined in the original block as being live
@@ -1068,19 +1069,6 @@ EmitSchedule(MachineBasicBlock::iterator &InsertPos) {
           }
         }
 
-    // Bit of a hack: The copy block we created here exists only because we want
-    // the CFG to work with the current system. However, the successors to the
-    // block with the INLINEASM_BR instruction expect values to come from *that*
-    // block, not this usurper block. Thus we steal its successors and add them
-    // to the copy so that everyone is happy.
-    for (auto *Succ : BB->successors())
-      if (Succ != CopyBB && !CopyBB->isSuccessor(Succ))
-        CopyBB->addSuccessor(Succ, BranchProbability::getZero());
-
-    for (auto *Succ : CopyBB->successors())
-      if (BB->isSuccessor(Succ))
-        BB->removeSuccessor(Succ);
-
     CopyBB->normalizeSuccProbs();
     BB->normalizeSuccProbs();
 

diff  --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index fbe81522f08d..413e49877a10 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -2944,6 +2944,16 @@ void SelectionDAGBuilder::UpdateSplitBlock(MachineBasicBlock *First,
   for (unsigned i = 0, e = SL->BitTestCases.size(); i != e; ++i)
     if (SL->BitTestCases[i].Parent == First)
       SL->BitTestCases[i].Parent = Last;
+
+  // SelectionDAGISel::FinishBasicBlock will add PHI operands for the
+  // successors of the fallthrough block. Here, we add PHI operands for the
+  // successors of the INLINEASM_BR block itself.
+  if (First->getFirstTerminator()->getOpcode() == TargetOpcode::INLINEASM_BR)
+    for (std::pair<MachineInstr *, unsigned> &pair : FuncInfo.PHINodesToUpdate)
+      if (First->isSuccessor(pair.first->getParent()))
+        MachineInstrBuilder(*First->getParent(), pair.first)
+            .addReg(pair.second)
+            .addMBB(First);
 }
 
 void SelectionDAGBuilder::visitIndirectBr(const IndirectBrInst &I) {

diff  --git a/llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll b/llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll
new file mode 100644
index 000000000000..b35aa98e8d83
--- /dev/null
+++ b/llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll
@@ -0,0 +1,73 @@
+; Tests that InstrEmitter::EmitMachineNode correctly sets predecessors and
+; successors.
+
+; RUN: llc -stop-after=finalize-isel -print-after=finalize-isel -mtriple=i686-- < %s 2>&1 | FileCheck %s
+
+; The block containting the INLINEASM_BR should have a fallthrough and its
+; indirect targets as its successors. The fallthrough is a block we synthesized
+; in InstrEmitter::EmitMachineNode. Fallthrough should have 100% branch weight,
+; while the indirect targets have 0%.
+; CHECK: bb.0 (%ir-block.2):
+; CHECK-NEXT: successors: %bb.4(0x00000000), %bb.6(0x80000000); %bb.4(0.00%), %bb.6(100.00%)
+
+; The fallthrough block is predaccessed by the block containing INLINEASM_BR,
+; and succeeded by the INLINEASM_BR's original fallthrough block pre-splitting.
+; CHECK: bb.6 (%ir-block.2):
+; CHECK-NEXT: predecessors: %bb.0
+; CHECK-NEXT: successors: %bb.1(0x80000000); %bb.1(100.00%)
+
+; Another block containing a second INLINEASM_BR. Check it has two successors,
+; and the the probability for fallthrough is 100%. Predecessor check irrelevant.
+; CHECK: bb.1 (%ir-block.4):
+; CHECK: successors: %bb.2(0x00000000), %bb.7(0x80000000); %bb.2(0.00%), %bb.7(100.00%)
+
+; Check the synthesized fallthrough block for the second INLINEASM_BR is
+; preceded correctly, and has the original successor pre-splitting.
+; CHECK: bb.7 (%ir-block.4):
+; CHECK-NEXT: predecessors: %bb.1
+; CHECK-NEXT: successors: %bb.3(0x80000000); %bb.3(100.00%)
+
+; Check the second INLINEASM_BR target block is preceded by the block with the
+; second INLINEASM_BR.
+; CHECK: bb.2 (%ir-block.7, address-taken):
+; CHECK-NEXT: predecessors: %bb.1
+
+; Check the first INLINEASM_BR target block is predecessed by the block with
+; the first INLINEASM_BR.
+; CHECK: bb.4 (%ir-block.11, address-taken):
+; CHECK-NEXT: predecessors: %bb.0
+
+ at .str = private unnamed_addr constant [26 x i8] c"inline asm#1 returned %d\0A\00", align 1
+ at .str.2 = private unnamed_addr constant [26 x i8] c"inline asm#2 returned %d\0A\00", align 1
+ at str = private unnamed_addr constant [30 x i8] c"inline asm#1 caused exception\00", align 1
+ at str.4 = private unnamed_addr constant [30 x i8] c"inline asm#2 caused exception\00", align 1
+
+; Function Attrs: nounwind uwtable
+define dso_local i32 @main(i32 %0, i8** nocapture readnone %1) #0 {
+  %3 = callbr i32 asm "jmp ${1:l}", "=r,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@main, %11)) #3
+          to label %4 [label %11]
+
+4:                                                ; preds = %2
+  %5 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([26 x i8], [26 x i8]* @.str, i64 0, i64 0), i32 %3)
+  %6 = callbr i32 asm "jmp ${1:l}", "=r,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@main, %7)) #3
+          to label %9 [label %7]
+
+7:                                                ; preds = %4
+  %8 = tail call i32 @puts(i8* nonnull dereferenceable(1) getelementptr inbounds ([30 x i8], [30 x i8]* @str.4, i64 0, i64 0))
+  br label %13
+
+9:                                                ; preds = %4
+  %10 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([26 x i8], [26 x i8]* @.str.2, i64 0, i64 0), i32 %6)
+  br label %13
+
+11:                                               ; preds = %2
+  %12 = tail call i32 @puts(i8* nonnull dereferenceable(1) getelementptr inbounds ([30 x i8], [30 x i8]* @str, i64 0, i64 0))
+  br label %13
+
+13:                                               ; preds = %11, %9, %7
+  %14 = phi i32 [ 1, %7 ], [ 0, %9 ], [ 1, %11 ]
+  ret i32 %14
+}
+
+declare dso_local i32 @printf(i8* nocapture readonly, ...) local_unnamed_addr #1
+declare i32 @puts(i8* nocapture readonly) local_unnamed_addr #2

diff  --git a/llvm/test/CodeGen/X86/callbr-asm-outputs.ll b/llvm/test/CodeGen/X86/callbr-asm-outputs.ll
index 0107db3b7282..8718363f3fd7 100644
--- a/llvm/test/CodeGen/X86/callbr-asm-outputs.ll
+++ b/llvm/test/CodeGen/X86/callbr-asm-outputs.ll
@@ -14,7 +14,10 @@ define i32 @test1(i32 %x) {
 ; CHECK-NEXT:    #NO_APP
 ; CHECK-NEXT:  .LBB0_1: # %normal
 ; CHECK-NEXT:    retl
-; CHECK-NEXT:  .Ltmp0: # Address of block that was removed by CodeGen
+; CHECK-NEXT:  .Ltmp0: # Block address taken
+; CHECK-NEXT:  .LBB0_2: # %abnormal
+; CHECK-NEXT:    movl $1, %eax
+; CHECK-NEXT:    retl
 entry:
   %add = add nsw i32 %x, 4
   %ret = callbr i32 asm "xorl $1, $0; jmp ${2:l}", "=r,r,X,~{dirflag},~{fpsr},~{flags}"(i32 %add, i8* blockaddress(@test1, %abnormal))
@@ -47,25 +50,29 @@ define i32 @test2(i32 %out1, i32 %out2) {
 ; CHECK-NEXT:    testl %edi, %esi
 ; CHECK-NEXT:    jne .Ltmp1
 ; CHECK-NEXT:    #NO_APP
-; CHECK-NEXT:  .LBB1_2:
-; CHECK-NEXT:    jmp .LBB1_4
-; CHECK-NEXT:  .LBB1_3: # %if.else
-; CHECK-NEXT:    #APP
-; CHECK-NEXT:    testl %esi, %edi
-; CHECK-NEXT:    testl %esi, %edi
-; CHECK-NEXT:    jne .Ltmp2
-; CHECK-NEXT:    #NO_APP
-; CHECK-NEXT:  .LBB1_4:
-; CHECK-NEXT:    movl %esi, %eax
-; CHECK-NEXT:    addl %edi, %eax
+; CHECK-NEXT:  .LBB1_2: # %if.then
+; CHECK-NEXT:    movl %edi, %eax
+; CHECK-NEXT:    addl %esi, %eax
 ; CHECK-NEXT:  .Ltmp2: # Block address taken
-; CHECK-NEXT:  # %bb.5: # %return
+; CHECK-NEXT:  .LBB1_6: # %return
 ; CHECK-NEXT:    popl %esi
 ; CHECK-NEXT:    .cfi_def_cfa_offset 8
 ; CHECK-NEXT:    popl %edi
 ; CHECK-NEXT:    .cfi_def_cfa_offset 4
 ; CHECK-NEXT:    retl
-; CHECK-NEXT:  .Ltmp1: # Address of block that was removed by CodeGen
+; CHECK-NEXT:  .LBB1_3: # %if.else
+; CHECK-NEXT:    .cfi_def_cfa_offset 12
+; CHECK-NEXT:    #APP
+; CHECK-NEXT:    testl %esi, %edi
+; CHECK-NEXT:    testl %esi, %edi
+; CHECK-NEXT:    jne .Ltmp2
+; CHECK-NEXT:    #NO_APP
+; CHECK-NEXT:  .LBB1_4: # %if.else
+; CHECK-NEXT:    jmp .LBB1_2
+; CHECK-NEXT:  .Ltmp1: # Block address taken
+; CHECK-NEXT:  .LBB1_5: # %label_true
+; CHECK-NEXT:    movl $-2, %eax
+; CHECK-NEXT:    jmp .LBB1_6
 entry:
   %cmp = icmp slt i32 %out1, %out2
   br i1 %cmp, label %if.then, label %if.else
@@ -109,7 +116,7 @@ define i32 @test3(i1 %cmp) {
 ; CHECK-NEXT:    .short %esi
 ; CHECK-NEXT:    .short %edi
 ; CHECK-NEXT:    #NO_APP
-; CHECK-NEXT:  .LBB2_2:
+; CHECK-NEXT:  .LBB2_2: # %true
 ; CHECK-NEXT:    movl %edi, %eax
 ; CHECK-NEXT:    jmp .LBB2_5
 ; CHECK-NEXT:  .LBB2_3: # %false
@@ -117,7 +124,7 @@ define i32 @test3(i1 %cmp) {
 ; CHECK-NEXT:    .short %eax
 ; CHECK-NEXT:    .short %edx
 ; CHECK-NEXT:    #NO_APP
-; CHECK-NEXT:  .LBB2_4:
+; CHECK-NEXT:  .LBB2_4: # %false
 ; CHECK-NEXT:    movl %edx, %eax
 ; CHECK-NEXT:  .LBB2_5: # %asm.fallthrough
 ; CHECK-NEXT:    popl %esi
@@ -125,7 +132,11 @@ define i32 @test3(i1 %cmp) {
 ; CHECK-NEXT:    popl %edi
 ; CHECK-NEXT:    .cfi_def_cfa_offset 4
 ; CHECK-NEXT:    retl
-; CHECK-NEXT:  .Ltmp3: # Address of block that was removed by CodeGen
+; CHECK-NEXT:  .Ltmp3: # Block address taken
+; CHECK-NEXT:  .LBB2_6: # %indirect
+; CHECK-NEXT:    .cfi_def_cfa_offset 12
+; CHECK-NEXT:    movl $42, %eax
+; CHECK-NEXT:    jmp .LBB2_5
 entry:
   br i1 %cmp, label %true, label %false
 
@@ -161,13 +172,16 @@ define i32 @test4(i32 %out1, i32 %out2) {
 ; CHECK-NEXT:    testl %ecx, %edx
 ; CHECK-NEXT:    jne .Ltmp5
 ; CHECK-NEXT:    #NO_APP
-; CHECK-NEXT:  .LBB3_2: # %asm.fallthrough2
+; CHECK-NEXT:  .LBB3_2: # %asm.fallthrough
 ; CHECK-NEXT:    addl %edx, %ecx
 ; CHECK-NEXT:    movl %ecx, %eax
+; CHECK-NEXT:    retl
+; CHECK-NEXT:  .Ltmp4: # Block address taken
+; CHECK-NEXT:  .LBB3_3: # %label_true
+; CHECK-NEXT:    movl $-2, %eax
 ; CHECK-NEXT:  .Ltmp5: # Block address taken
-; CHECK-NEXT:  # %bb.3: # %return
+; CHECK-NEXT:  .LBB3_4: # %return
 ; CHECK-NEXT:    retl
-; CHECK-NEXT:  .Ltmp4: # Address of block that was removed by CodeGen
 entry:
   %0 = callbr { i32, i32 } asm sideeffect "testl $0, $0; testl $1, $2; jne ${3:l}", "=r,=r,r,X,X,~{dirflag},~{fpsr},~{flags}"(i32 %out1, i8* blockaddress(@test4, %label_true), i8* blockaddress(@test4, %return))
           to label %asm.fallthrough [label %label_true, label %return]


        


More information about the llvm-commits mailing list