[PATCH] D134152: [SimplifyCFG][TranformUtils]Do not simplify away a trivial basic block if both this block and at least one of its predecessors are loop latches.

Mingming Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Sep 18 18:14:10 PDT 2022


mingmingl created this revision.
Herald added a subscriber: hiraditya.
Herald added a project: All.
mingmingl requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

- Before this patch, loop metadata (if exists) will override the metadata of each predecessor; if the predecessor block already has loop metadata, the orignal loop metadata won't be preserved (see test case).

Two alternative options are considered but they won't work well:

1. Partially forwards 'BB->pred' to 'BB->succ', in other words, only do forwarding for predecessors of BB that are not loop latches.

  In this case, 'BB' might be partially duplicated (e.g., the PHI instructions) into 'BB->succ' for non-loop-latch pred, and meanwhile still preserved for loop-latch pred. 'BB->succ' has more predecesors.

2. Proceed with deleting 'BB' when loop metadata of BB and Pred could be concatenated correctly (e.g. loop properties doesn't conflict, and the concatenation won't changing loop transformation result due to pass ordering, etc).
  1. Firstly, multiple transformations in the same metadata are ambigious -> the transformation result depend on execution order of different loop passes. Say 'BB' and 'BB->pred' are outer and inner loop respectively, concatenating them correctly for different scenarios is tricky.
  2. In a subset of cases when concat is correct, a) if the inner-most loop of 'BB' and inner-most loop of Pred are the same loop, LoopSimplify (scheduled before each LoopPass by pass managers) will ensure there is only exactly one loop latch. So it's better to let LoopSimplify simplify the CFG. b) If 'BB' is from an outer loop and 'Pred' is from an inner loop, then 'BB' is an exit block of the inner-loop with predecessors that are out of the inner-loop. As a result, LoopSimplify will create dedicated exits for the inner loop, which has the same effect of adding 'BB' back. By then it's ambiguious how to undo the concatenation of loop metadata.

    In practice, the SimplifyCFG pass could run many times in an optimizer pipeline. When SimplifyCFG run after all loops passes complete, loops might have been simplified away (e.g. completely unrolled). In these cases, this function could still kick in and simplify trival 'BB' away.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D134152

Files:
  llvm/lib/Transforms/Utils/Local.cpp
  llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll


Index: llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll
===================================================================
--- llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll
+++ llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll
@@ -1,5 +1,5 @@
-; RUN: opt -opaque-pointers -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -keep-loops=false -S < %s | FileCheck %s
-; RUN: opt -opaque-pointers -passes='simplifycfg<no-keep-loops>' -S < %s | FileCheck %s
+; RUN: opt -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -keep-loops=false -S < %s | FileCheck %s
+; RUN: opt -passes='simplifycfg<no-keep-loops>' -S < %s | FileCheck %s
 
 define void @test1(i32 %n) #0 {
 entry:
@@ -71,7 +71,7 @@
 ;    }
 ;    return sum;
 ; }
-define i32 @test2(i32 %a, i32 %b, i32 %step, i32 %remainder, ptr %input) {
+define i32 @test2(i32 %a, i32 %b, i32 %step, i32 %remainder, i32* %input) {
 entry:
   br label %while.cond
 
@@ -93,8 +93,8 @@
   %k.07 = phi i32 [ 0, %while.body ], [ %inc, %for.body ]
   %add2 = add nsw i32 %k.07, %add
   %idxprom = sext i32 %add2 to i64
-  %arrayidx = getelementptr inbounds i32, ptr %input, i64 %idxprom
-  %0 = load i32, ptr %arrayidx, align 4
+  %arrayidx = getelementptr inbounds i32, i32* %input, i64 %idxprom
+  %0 = load i32, i32* %arrayidx, align 4
   %1 = tail call i32 asm sideeffect "add ${0:w}, ${1:w}\0A", "=r,r,~{cc}"(i32 %0)
   %inc = add nuw nsw i32 %k.07, 1
   %cmp1 = icmp ult i32 %inc, 5
@@ -113,4 +113,5 @@
 !5 = !{!"llvm.loop.unroll.enable"}
 ; CHECK: !0 = distinct !{!0, !1}
 ; CHECK: !1 = !{!"llvm.loop.distribute.enable", i1 true}
-; CHECK-NOT: !{!"llvm.loop.unroll.enable"}
+; CHECK: !4 = distinct !{!4, !3, !5}
+; CHECK: !5 = !{!"llvm.loop.unroll.enable"}
Index: llvm/lib/Transforms/Utils/Local.cpp
===================================================================
--- llvm/lib/Transforms/Utils/Local.cpp
+++ llvm/lib/Transforms/Utils/Local.cpp
@@ -1090,6 +1090,47 @@
     }
   }
 
+  // If the unconditional branch we replaced contains llvm.loop metadata
+  // and any of its predecessor is also a loop latch, bail out directly.
+  //
+  // Two alternative options that won't work well:
+  // 1. Partially forwards 'BB->pred' to 'BB->succ', only for predecessors of BB
+  // that are not loop latches.
+  //    In this case, 'BB' might be partially duplicated (e.g., the PHI
+  //    instructions) into 'BB->succ' for non-loop-latch pred, and meanwhile
+  //    still preserved for loop-latch pred. 'BB->succ' has more predecesors.
+  // 2. Proceed with deleting 'BB' when loop metadata of BB and Pred could be
+  // concatenated correctly (e.g. loop properties doesn't conflict, and the
+  // concatenation won't changing loop transformation result due to pass
+  // ordering, etc).
+  //    1) Firstly, multiple transformations in the same metadata are ambigious
+  //    -> the transformation result depend on execution order of different loop
+  //    passes. Say 'BB' and 'BB->pred' are outer and inner loop respectively,
+  //    concatenating them correctly for different scenarios is tricky. 2) In a
+  //    subset of cases when concat is correct,
+  //       a) if the inner-most loop of 'BB' and inner-most loop of Pred are the
+  //       same loop, LoopSimplify (scheduled before each LoopPass by pass
+  //       managers) will ensure there is only exactly one loop latch. So it's
+  //       better to let LoopSimplify simplify the CFG. b) If 'BB' is from an
+  //       outer loop and 'Pred' is from an inner loop, then 'BB' is an exit
+  //       block of the inner-loop with predecessors that are out of the
+  //       inner-loop. As a result, LoopSimplify will create dedicated exits for
+  //       the inner loop, which has the same effect of adding 'BB' back. By
+  //       then it's ambiguious how to undo the concatenation of loop metadata.
+  //
+  // In practice, the SimplifyCFG pass could run many times in an optimizer
+  // pipeline. When SimplifyCFG run after all loops passes complete, loops might
+  // have been simplified away (e.g. completely unrolled). In these cases, this
+  // function could still kick in and simplify trival 'BB' away.
+  if (Instruction *TI = BB->getTerminator())
+    if (MDNode *LoopMD = TI->getMetadata(LLVMContext::MD_loop)) {
+      for (BasicBlock *Pred : predecessors(BB)) {
+        if (Instruction *PredTI = Pred->getTerminator())
+          if (MDNode *PredLoopMD = PredTI->getMetadata(LLVMContext::MD_loop))
+            return false;
+      }
+    }
+
   LLVM_DEBUG(dbgs() << "Killing Trivial BB: \n" << *BB);
 
   SmallVector<DominatorTree::UpdateType, 32> Updates;


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D134152.461111.patch
Type: text/x-patch
Size: 4648 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220919/c232fead/attachment-0001.bin>


More information about the llvm-commits mailing list