[llvm] 8d2885c - [GuardWidening] Improve analysis of potential widening into hotter block

Sun Mar 19 22:52:25 PDT 2023

Author: Max Kazantsev
Date: 2023-03-20T12:51:27+07:00
New Revision: 8d2885c2ef98b81927c1f816691ec4e77cfc7f3f

URL: https://github.com/llvm/llvm-project/commit/8d2885c2ef98b81927c1f816691ec4e77cfc7f3f
DIFF: https://github.com/llvm/llvm-project/commit/8d2885c2ef98b81927c1f816691ec4e77cfc7f3f.diff

LOG: [GuardWidening] Improve analysis of potential widening into hotter block

There is a piece of logic in GuardWidening which is very limited, and it happens
to ignore implicit control flow, therefore it "works fine" with guards expressed as
intrinsic calls. However, when the guards are represented as branches, its limitations
create a lot of trouble.

The intent here is to make sure that we do not widen code across complex CFG,
so that it can end up being in hotter code than it used to be. The old logic was limited
to unique immediate successor and that's it.

This patch changes the logic there to work the following way: when we need to check
if we can widen from `DominatedBlock` into `DominatingBlock`, we first try to find the
lowest (by CFG) transitive successor  of `DominatingBlock` which is guaranteed to not
be significantly colder than the `DominatingBlock`. It means that every time we move
to either:

- Unique successor of the current block, if it only has one successor;
- The only taken successor, if the current block ends with `br(const)`;
- If one of successors ends with deopt, another one is taken;

If the lowest block we can find this way is the `DominatedBlock`, then it is safe to assume
that this widening won't move the code into a hotter location.

I did not touch the existing check with PDT. It looks fishy to me (post-dominance doesn't
really guarantee anything about hotness), but let's keep it as is and maybe fix later.

With this patch, Guard Widening can widen explicitly expressed branches across more than one
dominating guard if it's profitable.

Differential Revision: https://reviews.llvm.org/D146276
Reviewed By: skatkov

Added: 
    

Modified: 
    llvm/lib/Transforms/Scalar/GuardWidening.cpp
    llvm/test/Transforms/GuardWidening/two_forms_behavior_consistency.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Transforms/Scalar/GuardWidening.cpp b/llvm/lib/Transforms/Scalar/GuardWidening.cpp
index 064c7b1d0ad12..e354b70357cd9 100644

--- a/llvm/lib/Transforms/Scalar/GuardWidening.cpp
+++ b/llvm/lib/Transforms/Scalar/GuardWidening.cpp
@@ -460,27 +460,55 @@ GuardWideningImpl::computeWideningScore(Instruction *DominatedInstr,
   if (HoistingOutOfLoop)
     return WS_Positive;
 
-  // Returns true if we might be hoisting above explicit control flow.  Note
-  // that this completely ignores implicit control flow (guards, calls which
-  // throw, etc...).  That choice appears arbitrary.
-  auto MaybeHoistingOutOfIf = [&]() {
-    auto *DominatingBlock = DominatingGuard->getParent();
-    auto *DominatedBlock = DominatedInstr->getParent();
-    if (isGuardAsWidenableBranch(DominatingGuard))
-      DominatingBlock = cast<BranchInst>(DominatingGuard)->getSuccessor(0);
+  // For a given basic block \p BB, return its successor which is guaranteed or
+  // highly likely will be taken as its successor.
+  auto GetLikelySuccessor = [](const BasicBlock * BB)->const BasicBlock * {
+    if (auto *UniqueSucc = BB->getUniqueSuccessor())
+      return UniqueSucc;
+    auto *Term = BB->getTerminator();
+    Value *Cond = nullptr;
+    const BasicBlock *IfTrue = nullptr, *IfFalse = nullptr;
+    using namespace PatternMatch;
+    if (!match(Term, m_Br(m_Value(Cond), m_BasicBlock(IfTrue),
+                          m_BasicBlock(IfFalse))))
+      return nullptr;
+    // For constant conditions, only one dynamical successor is possible
+    if (auto *ConstCond = dyn_cast<ConstantInt>(Cond))
+      return ConstCond->isAllOnesValue() ? IfTrue : IfFalse;
+    // If one of successors ends with deopt, another one is likely.
+    if (IfFalse->getPostdominatingDeoptimizeCall())
+      return IfTrue;
+    if (IfTrue->getPostdominatingDeoptimizeCall())
+      return IfFalse;
+    // TODO: Use branch frequency metatada to allow hoisting through non-deopt
+    // branches?
+    return nullptr;
+  };
+
+  // Returns true if we might be hoisting above explicit control flow into a
+  // considerably hotter block.  Note that this completely ignores implicit
+  // control flow (guards, calls which throw, etc...).  That choice appears
+  // arbitrary (we assume that implicit control flow exits are all rare).
+  auto MaybeHoistingToHotterBlock = [&]() {
+    const auto *DominatingBlock = DominatingGuard->getParent();
+    const auto *DominatedBlock = DominatedInstr->getParent();
+
+    // Descent as low as we can, always taking the likely successor.
+    while (DominatedBlock != DominatingBlock)
+      if (auto *LikelySucc = GetLikelySuccessor(DominatingBlock))
+        DominatingBlock = LikelySucc;
+      else
+        break;
 
     // Same Block?
     if (DominatedBlock == DominatingBlock)
       return false;
-    // Obvious successor (common loop header/preheader case)
-    if (DominatedBlock == DominatingBlock->getUniqueSuccessor())
-      return false;
     // TODO: diamond, triangle cases
     if (!PDT) return true;
     return !PDT->dominates(DominatedBlock, DominatingBlock);
   };
 
-  return MaybeHoistingOutOfIf() ? WS_IllegalOrNegative : WS_Neutral;
+  return MaybeHoistingToHotterBlock() ? WS_IllegalOrNegative : WS_Neutral;
 }
 
 bool GuardWideningImpl::canBeHoistedTo(

diff  --git a/llvm/test/Transforms/GuardWidening/two_forms_behavior_consistency.ll b/llvm/test/Transforms/GuardWidening/two_forms_behavior_consistency.ll
index 69bddeea9fbb8..9b60f4e3e62b7 100644
--- a/llvm/test/Transforms/GuardWidening/two_forms_behavior_consistency.ll
+++ b/llvm/test/Transforms/GuardWidening/two_forms_behavior_consistency.ll
@@ -42,30 +42,26 @@ define void @test_01(i32 %a, i32 %b, i32 %c, i32 %d) {
 ; BRANCH_FORM-NEXT:  entry:
 ; BRANCH_FORM-NEXT:    br label [[LOOP:%.*]]
 ; BRANCH_FORM:       loop:
-; BRANCH_FORM-NEXT:    [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[GUARDED5:%.*]] ]
+; BRANCH_FORM-NEXT:    [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[GUARDED:%.*]] ]
 ; BRANCH_FORM-NEXT:    [[IV_NEXT]] = add i32 [[IV]], 1
 ; BRANCH_FORM-NEXT:    [[C1:%.*]] = icmp ult i32 [[IV]], [[A]]
 ; BRANCH_FORM-NEXT:    [[C2:%.*]] = icmp ult i32 [[IV]], [[B]]
 ; BRANCH_FORM-NEXT:    [[WIDE_CHK:%.*]] = and i1 [[C1]], [[C2]]
+; BRANCH_FORM-NEXT:    [[C3:%.*]] = icmp ult i32 [[IV]], [[C]]
+; BRANCH_FORM-NEXT:    [[WIDE_CHK13:%.*]] = and i1 [[WIDE_CHK]], [[C3]]
+; BRANCH_FORM-NEXT:    [[C4:%.*]] = icmp ult i32 [[IV]], [[D]]
+; BRANCH_FORM-NEXT:    [[WIDE_CHK14:%.*]] = and i1 [[WIDE_CHK13]], [[C4]]
 ; BRANCH_FORM-NEXT:    [[WIDENABLE_COND:%.*]] = call i1 @llvm.experimental.widenable.condition()
-; BRANCH_FORM-NEXT:    [[EXIPLICIT_GUARD_COND:%.*]] = and i1 [[WIDE_CHK]], [[WIDENABLE_COND]]
-; BRANCH_FORM-NEXT:    br i1 [[EXIPLICIT_GUARD_COND]], label [[GUARDED:%.*]], label [[DEOPT:%.*]], !prof [[PROF0:![0-9]+]]
+; BRANCH_FORM-NEXT:    [[EXIPLICIT_GUARD_COND:%.*]] = and i1 [[WIDE_CHK14]], [[WIDENABLE_COND]]
+; BRANCH_FORM-NEXT:    br i1 [[EXIPLICIT_GUARD_COND]], label [[GUARDED]], label [[DEOPT:%.*]], !prof [[PROF0:![0-9]+]]
 ; BRANCH_FORM:       deopt:
 ; BRANCH_FORM-NEXT:    call void (...) @llvm.experimental.deoptimize.isVoid() [ "deopt"() ]
 ; BRANCH_FORM-NEXT:    ret void
 ; BRANCH_FORM:       guarded:
 ; BRANCH_FORM-NEXT:    [[WIDENABLE_COND3:%.*]] = call i1 @llvm.experimental.widenable.condition()
 ; BRANCH_FORM-NEXT:    [[EXIPLICIT_GUARD_COND4:%.*]] = and i1 [[C2]], [[WIDENABLE_COND3]]
-; BRANCH_FORM-NEXT:    [[C3:%.*]] = icmp ult i32 [[IV]], [[C]]
-; BRANCH_FORM-NEXT:    [[C4:%.*]] = icmp ult i32 [[IV]], [[D]]
-; BRANCH_FORM-NEXT:    [[WIDE_CHK13:%.*]] = and i1 [[C3]], [[C4]]
 ; BRANCH_FORM-NEXT:    [[WIDENABLE_COND7:%.*]] = call i1 @llvm.experimental.widenable.condition()
-; BRANCH_FORM-NEXT:    [[EXIPLICIT_GUARD_COND8:%.*]] = and i1 [[WIDE_CHK13]], [[WIDENABLE_COND7]]
-; BRANCH_FORM-NEXT:    br i1 [[EXIPLICIT_GUARD_COND8]], label [[GUARDED5]], label [[DEOPT6:%.*]], !prof [[PROF0]]
-; BRANCH_FORM:       deopt6:
-; BRANCH_FORM-NEXT:    call void (...) @llvm.experimental.deoptimize.isVoid() [ "deopt"() ]
-; BRANCH_FORM-NEXT:    ret void
-; BRANCH_FORM:       guarded5:
+; BRANCH_FORM-NEXT:    [[EXIPLICIT_GUARD_COND8:%.*]] = and i1 [[C3]], [[WIDENABLE_COND7]]
 ; BRANCH_FORM-NEXT:    [[WIDENABLE_COND11:%.*]] = call i1 @llvm.experimental.widenable.condition()
 ; BRANCH_FORM-NEXT:    [[EXIPLICIT_GUARD_COND12:%.*]] = and i1 [[C4]], [[WIDENABLE_COND11]]
 ; BRANCH_FORM-NEXT:    [[LOOP_COND:%.*]] = call i1 @cond()
@@ -78,30 +74,26 @@ define void @test_01(i32 %a, i32 %b, i32 %c, i32 %d) {
 ; BRANCH_FORM_LICM-NEXT:  entry:
 ; BRANCH_FORM_LICM-NEXT:    br label [[LOOP:%.*]]
 ; BRANCH_FORM_LICM:       loop:
-; BRANCH_FORM_LICM-NEXT:    [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[GUARDED5:%.*]] ]
+; BRANCH_FORM_LICM-NEXT:    [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[GUARDED:%.*]] ]
 ; BRANCH_FORM_LICM-NEXT:    [[IV_NEXT]] = add i32 [[IV]], 1
 ; BRANCH_FORM_LICM-NEXT:    [[C1:%.*]] = icmp ult i32 [[IV]], [[A]]
 ; BRANCH_FORM_LICM-NEXT:    [[C2:%.*]] = icmp ult i32 [[IV]], [[B]]
 ; BRANCH_FORM_LICM-NEXT:    [[WIDE_CHK:%.*]] = and i1 [[C1]], [[C2]]
+; BRANCH_FORM_LICM-NEXT:    [[C3:%.*]] = icmp ult i32 [[IV]], [[C]]
+; BRANCH_FORM_LICM-NEXT:    [[WIDE_CHK13:%.*]] = and i1 [[WIDE_CHK]], [[C3]]
+; BRANCH_FORM_LICM-NEXT:    [[C4:%.*]] = icmp ult i32 [[IV]], [[D]]
+; BRANCH_FORM_LICM-NEXT:    [[WIDE_CHK14:%.*]] = and i1 [[WIDE_CHK13]], [[C4]]
 ; BRANCH_FORM_LICM-NEXT:    [[WIDENABLE_COND:%.*]] = call i1 @llvm.experimental.widenable.condition()
-; BRANCH_FORM_LICM-NEXT:    [[EXIPLICIT_GUARD_COND:%.*]] = and i1 [[WIDE_CHK]], [[WIDENABLE_COND]]
-; BRANCH_FORM_LICM-NEXT:    br i1 [[EXIPLICIT_GUARD_COND]], label [[GUARDED:%.*]], label [[DEOPT:%.*]], !prof [[PROF0:![0-9]+]]
+; BRANCH_FORM_LICM-NEXT:    [[EXIPLICIT_GUARD_COND:%.*]] = and i1 [[WIDE_CHK14]], [[WIDENABLE_COND]]
+; BRANCH_FORM_LICM-NEXT:    br i1 [[EXIPLICIT_GUARD_COND]], label [[GUARDED]], label [[DEOPT:%.*]], !prof [[PROF0:![0-9]+]]
 ; BRANCH_FORM_LICM:       deopt:
 ; BRANCH_FORM_LICM-NEXT:    call void (...) @llvm.experimental.deoptimize.isVoid() [ "deopt"() ]
 ; BRANCH_FORM_LICM-NEXT:    ret void
 ; BRANCH_FORM_LICM:       guarded:
 ; BRANCH_FORM_LICM-NEXT:    [[WIDENABLE_COND3:%.*]] = call i1 @llvm.experimental.widenable.condition()
 ; BRANCH_FORM_LICM-NEXT:    [[EXIPLICIT_GUARD_COND4:%.*]] = and i1 [[C2]], [[WIDENABLE_COND3]]
-; BRANCH_FORM_LICM-NEXT:    [[C3:%.*]] = icmp ult i32 [[IV]], [[C]]
-; BRANCH_FORM_LICM-NEXT:    [[C4:%.*]] = icmp ult i32 [[IV]], [[D]]
-; BRANCH_FORM_LICM-NEXT:    [[WIDE_CHK13:%.*]] = and i1 [[C3]], [[C4]]
 ; BRANCH_FORM_LICM-NEXT:    [[WIDENABLE_COND7:%.*]] = call i1 @llvm.experimental.widenable.condition()
-; BRANCH_FORM_LICM-NEXT:    [[EXIPLICIT_GUARD_COND8:%.*]] = and i1 [[WIDE_CHK13]], [[WIDENABLE_COND7]]
-; BRANCH_FORM_LICM-NEXT:    br i1 [[EXIPLICIT_GUARD_COND8]], label [[GUARDED5]], label [[DEOPT6:%.*]], !prof [[PROF0]]
-; BRANCH_FORM_LICM:       deopt6:
-; BRANCH_FORM_LICM-NEXT:    call void (...) @llvm.experimental.deoptimize.isVoid() [ "deopt"() ]
-; BRANCH_FORM_LICM-NEXT:    ret void
-; BRANCH_FORM_LICM:       guarded5:
+; BRANCH_FORM_LICM-NEXT:    [[EXIPLICIT_GUARD_COND8:%.*]] = and i1 [[C3]], [[WIDENABLE_COND7]]
 ; BRANCH_FORM_LICM-NEXT:    [[WIDENABLE_COND11:%.*]] = call i1 @llvm.experimental.widenable.condition()
 ; BRANCH_FORM_LICM-NEXT:    [[EXIPLICIT_GUARD_COND12:%.*]] = and i1 [[C4]], [[WIDENABLE_COND11]]
 ; BRANCH_FORM_LICM-NEXT:    [[LOOP_COND:%.*]] = call i1 @cond()