[llvm] [FuncSpec] Update function specialization to handle phi-chains (PR #72903)

Mats Petersson via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 22 02:29:08 PST 2023


https://github.com/Leporacanthicus updated https://github.com/llvm/llvm-project/pull/72903

>From c263ffa159ad8cb656c52155cb067722883c3015 Mon Sep 17 00:00:00 2001
From: Mats Petersson <mats.petersson at arm.com>
Date: Mon, 20 Nov 2023 18:56:45 +0000
Subject: [PATCH 1/3] [FuncSpec] Update function specialization to handle
 phi-chains

When using the LLVM flang compiler with alias analysis (AA) enabled,
SPEC2017:548.exchange2_r was running significantly slower than
without AA.

This was caused by the GVN pass replacing many of the loads in the
pre-AA code with phi-nodes that form a long chain of dependencies,
which the function specialization was unable to follow.

This adds a function to discover phi-nodes in a transitive set, with
limits on the number of iterations and on the number of incoming values
per phi-node to avoid spending ages analysing phi-nodes.
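
For illustration only, the discovery step can be sketched outside LLVM as a
small worklist traversal. This is a minimal sketch under stated assumptions,
not the code in this patch: `Phi`, `Incoming` and `discoverTransitivePhis`
are hypothetical stand-ins for PHINode, its incoming values and
discoverTransitivelyIncomingValues. The sketch ignores dead blocks and the
known-constants lookup, and it simply gives up when either limit is exceeded.
The default limits mirror the cl::init values in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins for LLVM values; these are NOT the real LLVM classes.
struct Phi;

struct Incoming {
  enum Kind { Constant, PhiRef, Other };
  Kind kind;
  long constant;   // meaningful only when kind == Constant
  const Phi *phi;  // meaningful only when kind == PhiRef

  static Incoming constantValue(long c) { return {Constant, c, nullptr}; }
  static Incoming phiRef(const Phi *p) { return {PhiRef, 0, p}; }
  static Incoming other() { return {Other, 0, nullptr}; }
};

struct Phi {
  std::vector<Incoming> incoming;
};

// Returns true if every value that can transitively feed `root` is either
// the constant `expected` or another phi in the chain. Returns false as
// soon as a different constant, a non-phi value, or a limit is hit.
bool discoverTransitivePhis(long expected, const Phi *root,
                            std::unordered_set<const Phi *> &seen,
                            unsigned maxIterations = 100,
                            std::size_t maxIncoming = 8) {
  assert(root != nullptr && "need a starting phi");
  std::vector<const Phi *> workList{root};
  unsigned iter = 0;

  while (!workList.empty()) {
    const Phi *pn = workList.back();
    workList.pop_back();

    // Bound the analysis cost: every pop counts, including revisits.
    if (++iter > maxIterations || pn->incoming.size() > maxIncoming)
      return false;

    // Already explored this phi; nothing new to learn.
    if (!seen.insert(pn).second)
      continue;

    for (const Incoming &in : pn->incoming) {
      switch (in.kind) {
      case Incoming::Constant:
        if (in.constant != expected)
          return false; // two different constants reach the root
        break;
      case Incoming::PhiRef:
        if (in.phi != pn) // disregard self-references
          workList.push_back(in.phi);
        break;
      case Incoming::Other:
        return false; // cannot reason about anything else
      }
    }
  }
  return true;
}
```

In this model a cycle of phis whose only non-phi inputs are the same constant
is accepted, while a single differing constant or unknown value anywhere in
the chain rejects the whole chain.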

The minimum latency savings threshold also had to be lowered (from 70
to 40), since fewer load instructions mean less saving.

Also adds some more prints to help debug the isProfitable decision.

No significant change in compile time or generated code-size.

Co-authored-by: Alexandros Lamprineas <alexandros.lamprineas at arm.com>
---
 .../Transforms/IPO/FunctionSpecialization.h   |   4 +
 .../Transforms/IPO/FunctionSpecialization.cpp | 119 +++++++++++++++---
 .../discover-transitive-phis.ll               |  87 +++++++++++++
 .../phi-nodes-non-constfoldable.ll            |  54 ++++++++
 4 files changed, 250 insertions(+), 14 deletions(-)
 create mode 100644 llvm/test/Transforms/FunctionSpecialization/discover-transitive-phis.ll
 create mode 100644 llvm/test/Transforms/FunctionSpecialization/phi-nodes-non-constfoldable.ll

diff --git a/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h b/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
index 50f9aae73dc53e2..882bda0f4cdefe6 100644
--- a/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
+++ b/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
@@ -217,6 +217,10 @@ class InstCostVisitor : public InstVisitor<InstCostVisitor, Constant *> {
   Cost estimateSwitchInst(SwitchInst &I);
   Cost estimateBranchInst(BranchInst &I);
 
+  bool discoverTransitivelyIncomingValues(
+      Constant *Const, PHINode *Root, DenseSet<PHINode *> &TransitivePHIs,
+      SmallVectorImpl<PHINode *> &UnknownIncomingValues);
+
   Constant *visitInstruction(Instruction &I) { return nullptr; }
   Constant *visitPHINode(PHINode &I);
   Constant *visitFreezeInst(FreezeInst &I);
diff --git a/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp b/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
index b75ca7761a60b62..a0a912ecdb897a6 100644
--- a/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
@@ -39,10 +39,17 @@ static cl::opt<unsigned> MaxClones(
     "The maximum number of clones allowed for a single function "
     "specialization"));
 
+static cl::opt<unsigned>
+    MaxDiscoveryIterations("funcspec-max-discovery-iterations", cl::init(100),
+                           cl::Hidden,
+                           cl::desc("The maximum number of iterations allowed "
+                                    "when searching for transitive "
+                                    "phis"));
+
 static cl::opt<unsigned> MaxIncomingPhiValues(
-    "funcspec-max-incoming-phi-values", cl::init(4), cl::Hidden, cl::desc(
-    "The maximum number of incoming values a PHI node can have to be "
-    "considered during the specialization bonus estimation"));
+    "funcspec-max-incoming-phi-values", cl::init(8), cl::Hidden,
+    cl::desc("The maximum number of incoming values a PHI node can have to be "
+             "considered during the specialization bonus estimation"));
 
 static cl::opt<unsigned> MaxBlockPredecessors(
     "funcspec-max-block-predecessors", cl::init(2), cl::Hidden, cl::desc(
@@ -64,9 +71,9 @@ static cl::opt<unsigned> MinCodeSizeSavings(
     "much percent of the original function size"));
 
 static cl::opt<unsigned> MinLatencySavings(
-    "funcspec-min-latency-savings", cl::init(70), cl::Hidden, cl::desc(
-    "Reject specializations whose latency savings are less than this"
-    "much percent of the original function size"));
+    "funcspec-min-latency-savings", cl::init(40), cl::Hidden,
+    cl::desc("Reject specializations whose latency savings are less than this "
+             "much percent of the original function size"));
 
 static cl::opt<unsigned> MinInliningBonus(
     "funcspec-min-inlining-bonus", cl::init(300), cl::Hidden, cl::desc(
@@ -262,29 +269,113 @@ Cost InstCostVisitor::estimateBranchInst(BranchInst &I) {
   return estimateBasicBlocks(WorkList);
 }
 
+bool InstCostVisitor::discoverTransitivelyIncomingValues(
+    Constant *Const, PHINode *Root, DenseSet<PHINode *> &TransitivePHIs,
+    SmallVectorImpl<PHINode *> &UnknownIncomingValues) {
+
+  SmallVector<PHINode *, 64> WorkList;
+  WorkList.push_back(Root);
+  unsigned Iter = 0;
+
+  while (!WorkList.empty()) {
+    PHINode *PN = WorkList.pop_back_val();
+
+    if (++Iter > MaxDiscoveryIterations ||
+        PN->getNumIncomingValues() > MaxIncomingPhiValues) {
+      // For now just collect the Phi and later we will check whether it is
+      // in the Transitive set.
+      UnknownIncomingValues.push_back(PN);
+      continue;
+      // FIXME: return false here and remove the UnknownIncomingValues entirely.
+    }
+
+    if (!TransitivePHIs.insert(PN).second)
+      continue;
+
+    for (unsigned I = 0, E = PN->getNumIncomingValues(); I != E; ++I) {
+      Value *V = PN->getIncomingValue(I);
+
+      // Disregard self-references and dead incoming values.
+      if (auto *Inst = dyn_cast<Instruction>(V))
+        if (Inst == PN || DeadBlocks.contains(PN->getIncomingBlock(I)))
+          continue;
+
+      if (Constant *C = findConstantFor(V, KnownConstants)) {
+        // Not all incoming values are the same constant. Bail immediately.
+        if (C != Const)
+          return false;
+        continue;
+      }
+
+      if (auto *Phi = dyn_cast<PHINode>(V)) {
+        WorkList.push_back(Phi);
+        continue;
+      }
+
+      // We can't reason about anything else.
+      return false;
+    }
+  }
+  return true;
+}
+
 Constant *InstCostVisitor::visitPHINode(PHINode &I) {
   if (I.getNumIncomingValues() > MaxIncomingPhiValues)
     return nullptr;
 
   bool Inserted = VisitedPHIs.insert(&I).second;
+  SmallVector<PHINode *, 8> UnknownIncomingValues;
+  DenseSet<PHINode *> TransitivePHIs;
   Constant *Const = nullptr;
+  bool HaveSeenIncomingPHI = false;
 
   for (unsigned Idx = 0, E = I.getNumIncomingValues(); Idx != E; ++Idx) {
     Value *V = I.getIncomingValue(Idx);
+
+    // Disregard self-references and dead incoming values.
     if (auto *Inst = dyn_cast<Instruction>(V))
       if (Inst == &I || DeadBlocks.contains(I.getIncomingBlock(Idx)))
         continue;
-    Constant *C = findConstantFor(V, KnownConstants);
-    if (!C) {
-      if (Inserted)
-        PendingPHIs.push_back(&I);
-      return nullptr;
+
+    if (Constant *C = findConstantFor(V, KnownConstants)) {
+      if (!Const)
+        Const = C;
+      // Not all incoming values are the same constant. Bail immediately.
+      if (C != Const)
+        return nullptr;
+      continue;
     }
-    if (!Const)
-      Const = C;
-    else if (C != Const)
+
+    if (Inserted) {
+      // First time we are seeing this phi. We will retry later, after
+      // all the constant arguments have been propagated. Bail for now.
+      PendingPHIs.push_back(&I);
       return nullptr;
+    }
+
+    if (isa<PHINode>(V)) {
+      // Perhaps it is a Transitive Phi. We will confirm later.
+      HaveSeenIncomingPHI = true;
+      continue;
+    }
+
+    // We can't reason about anything else.
+    return nullptr;
   }
+
+  assert(Const && "Should have found at least one constant incoming value");
+
+  if (!HaveSeenIncomingPHI)
+    return Const;
+
+  if (!discoverTransitivelyIncomingValues(Const, &I, TransitivePHIs,
+                                          UnknownIncomingValues))
+    return nullptr;
+
+  for (PHINode *Phi : UnknownIncomingValues)
+    if (!TransitivePHIs.contains(Phi))
+      return nullptr;
+
   return Const;
 }
 
diff --git a/llvm/test/Transforms/FunctionSpecialization/discover-transitive-phis.ll b/llvm/test/Transforms/FunctionSpecialization/discover-transitive-phis.ll
new file mode 100644
index 000000000000000..7e359d22bcfae8a
--- /dev/null
+++ b/llvm/test/Transforms/FunctionSpecialization/discover-transitive-phis.ll
@@ -0,0 +1,87 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+;
+; RUN: opt -passes="ipsccp<func-spec>" -funcspec-min-function-size=20 -funcspec-for-literal-constant -S < %s | FileCheck %s --check-prefix=FUNCSPEC
+; RUN: opt -passes="ipsccp<func-spec>" -funcspec-min-function-size=20 -funcspec-for-literal-constant -funcspec-max-discovery-iterations=11 -S < %s | FileCheck %s --check-prefix=NOFUNCSPEC
+
+define i64 @bar(i1 %c1, i1 %c2, i1 %c3, i1 %c4, i1 %c5, i1 %c6, i1 %c7, i1 %c8, i1 %c9, i1 %c10) {
+; FUNCSPEC-LABEL: define i64 @bar(
+; FUNCSPEC-SAME: i1 [[C1:%.*]], i1 [[C2:%.*]], i1 [[C3:%.*]], i1 [[C4:%.*]], i1 [[C5:%.*]], i1 [[C6:%.*]], i1 [[C7:%.*]], i1 [[C8:%.*]], i1 [[C9:%.*]], i1 [[C10:%.*]]) {
+; FUNCSPEC-NEXT:  entry:
+; FUNCSPEC-NEXT:    [[F1:%.*]] = call i64 @foo.specialized.1(i64 3, i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]], i1 [[C5]], i1 [[C6]], i1 [[C7]], i1 [[C8]], i1 [[C9]], i1 [[C10]]), !range [[RNG0:![0-9]+]]
+; FUNCSPEC-NEXT:    [[F2:%.*]] = call i64 @foo.specialized.2(i64 4, i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]], i1 [[C5]], i1 [[C6]], i1 [[C7]], i1 [[C8]], i1 [[C9]], i1 [[C10]]), !range [[RNG1:![0-9]+]]
+; FUNCSPEC-NEXT:    [[ADD:%.*]] = add nuw nsw i64 [[F1]], [[F2]]
+; FUNCSPEC-NEXT:    ret i64 [[ADD]]
+;
+; NOFUNCSPEC-LABEL: define i64 @bar(
+; NOFUNCSPEC-SAME: i1 [[C1:%.*]], i1 [[C2:%.*]], i1 [[C3:%.*]], i1 [[C4:%.*]], i1 [[C5:%.*]], i1 [[C6:%.*]], i1 [[C7:%.*]], i1 [[C8:%.*]], i1 [[C9:%.*]], i1 [[C10:%.*]]) {
+; NOFUNCSPEC-NEXT:  entry:
+; NOFUNCSPEC-NEXT:    [[F1:%.*]] = call i64 @foo(i64 3, i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]], i1 [[C5]], i1 [[C6]], i1 [[C7]], i1 [[C8]], i1 [[C9]], i1 [[C10]]), !range [[RNG0:![0-9]+]]
+; NOFUNCSPEC-NEXT:    [[F2:%.*]] = call i64 @foo(i64 4, i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]], i1 [[C5]], i1 [[C6]], i1 [[C7]], i1 [[C8]], i1 [[C9]], i1 [[C10]]), !range [[RNG0]]
+; NOFUNCSPEC-NEXT:    [[ADD:%.*]] = add nuw nsw i64 [[F1]], [[F2]]
+; NOFUNCSPEC-NEXT:    ret i64 [[ADD]]
+;
+entry:
+  %f1 = call i64 @foo(i64 3, i1 %c1, i1 %c2, i1 %c3, i1 %c4, i1 %c5, i1 %c6, i1 %c7, i1 %c8, i1 %c9, i1 %c10)
+  %f2 = call i64 @foo(i64 4, i1 %c1, i1 %c2, i1 %c3, i1 %c4, i1 %c5, i1 %c6, i1 %c7, i1 %c8, i1 %c9, i1 %c10)
+  %add = add i64 %f1, %f2
+  ret i64 %add
+}
+
+define internal i64 @foo(i64 %n, i1 %c1, i1 %c2, i1 %c3, i1 %c4, i1 %c5, i1 %c6, i1 %c7, i1 %c8, i1 %c9, i1 %c10) {
+entry:
+  br i1 %c1, label %l1, label %l9
+
+l1:
+  %phi1 = phi i64 [ %n, %entry ], [ %phi2, %l2 ]
+  %add = add i64 %phi1, 1
+  %div = sdiv i64 %add, 2
+  br i1 %c2, label %l1_5, label %exit
+
+l1_5:
+  br i1 %c3, label %l1_75, label %l6
+
+l1_75:
+  br i1 %c4, label %l2, label %l3
+
+l2:
+  %phi2 = phi i64 [ %phi1, %l1_75 ], [ %phi3, %l3 ]
+  br label %l1
+
+l3:
+  %phi3 = phi i64 [ %phi1, %l1_75 ], [ %phi4, %l4 ]
+  br label %l2
+
+l4:
+  %phi4 = phi i64 [ %phi5, %l5 ], [ %phi6, %l6 ]
+  br i1 %c5, label %l3, label %l6
+
+l5:
+  %phi5 = phi i64 [ %phi6, %l6_5 ], [ %phi7, %l7 ]
+  br label %l4
+
+l6:
+  %phi6 = phi i64 [ %phi4, %l4 ], [ %phi1, %l1_5 ]
+  br i1 %c6, label %l4, label %l6_5
+
+l6_5:
+  br i1 %c7, label %l5, label %l8
+
+l7:
+  %phi7 = phi i64 [ %phi9, %l9 ], [ %phi8, %l8 ]
+  br i1 %c8, label %l5, label %l8
+
+l8:
+  %phi8 = phi i64 [ %phi6, %l6_5 ], [ %phi7, %l7 ]
+  br i1 %c9, label %l7, label %l9
+
+l9:
+  %phi9 = phi i64 [ %n, %entry ], [ %phi8, %l8 ]
+  %sub = sub i64 %phi9, 1
+  %mul = mul i64 %sub, 2
+  br i1 %c10, label %l7, label %exit
+
+exit:
+  %res = phi i64 [ %div, %l1 ], [ %mul, %l9]
+  ret i64 %res
+}
+
diff --git a/llvm/test/Transforms/FunctionSpecialization/phi-nodes-non-constfoldable.ll b/llvm/test/Transforms/FunctionSpecialization/phi-nodes-non-constfoldable.ll
new file mode 100644
index 000000000000000..877997a70505794
--- /dev/null
+++ b/llvm/test/Transforms/FunctionSpecialization/phi-nodes-non-constfoldable.ll
@@ -0,0 +1,54 @@
+; RUN: opt -passes="ipsccp<func-spec>" -funcspec-min-function-size=20 -funcspec-for-literal-constant -S < %s
+
+define i64 @bar(i1 %c1, i1 %c2, i1 %c3, i1 %c4, i64 %x1) {
+; CHECK-LABEL: define i64 @bar(
+; CHECK-SAME: i1 [[C1:%.*]], i1 [[C2:%.*]], i1 [[C3:%.*]], i1 [[C4:%.*]], i64 [[X1:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[F1:%.*]] = call i64 @foo(i64 3, i64 4, i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]])
+; CHECK-NEXT:    [[F2:%.*]] = call i64 @foo(i64 4, i64 [[X1]], i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]])
+; CHECK-NEXT:    [[F3:%.*]] = call i64 @foo.specialized.1(i64 3, i64 3, i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]])
+; CHECK-NEXT:    [[ADD:%.*]] = add i64 [[F1]], [[F2]]
+; CHECK-NEXT:    [[ADD2:%.*]] = add i64 [[ADD]], [[F3]]
+; CHECK-NEXT:    ret i64 [[ADD2]]
+;
+entry:
+  %f1 = call i64 @foo(i64 3, i64 4, i1 %c1, i1 %c2, i1 %c3, i1 %c4)
+  %f2 = call i64 @foo(i64 4, i64 %x1, i1 %c1, i1 %c2, i1 %c3, i1 %c4)
+  %f3 = call i64 @foo(i64 3, i64 3, i1 %c1, i1 %c2, i1 %c3, i1 %c4)
+  %add = add i64 %f1, %f2
+  %add2 = add i64 %add, %f3
+  ret i64 %add2
+}
+
+define internal i64 @foo(i64 %n, i64 %m, i1 %c1, i1 %c2, i1 %c3, i1 %c4) {
+entry:
+  br i1 %c1, label %l1, label %l4
+
+l1:
+  %phi1 = phi i64 [ %n, %entry ], [ %phi2, %l2 ]
+  %add = add i64 %phi1, 1
+  %div = sdiv i64 %add, 2
+  br i1 %c2, label %l1_5, label %exit
+
+l1_5:
+  br i1 %c3, label %l2, label %l3
+
+l2:
+  %phi2 = phi i64 [ %phi1, %l1_5 ], [ %phi3, %l3 ]
+  br label %l1
+
+l3:
+  %phi3 = phi i64 [ %phi1, %l1_5 ], [ %m, %l4 ]
+  br i1 %c2, label %l4, label %l2
+
+l4:
+  %phi4 = phi i64 [ %n, %entry ], [ %phi3, %l3 ]
+  %sub = sub i64 %phi4, 1
+  %mul = mul i64 %sub, 2
+  br i1 %c4, label %l3, label %exit
+
+exit:
+  %res = phi i64 [ %div, %l1 ], [ %mul, %l4]
+  ret i64 %res
+}
+

>From c7856dd32995e1e27d9a4db848697f59b8775561 Mon Sep 17 00:00:00 2001
From: Mats Petersson <mats.petersson at arm.com>
Date: Tue, 21 Nov 2023 17:58:58 +0000
Subject: [PATCH 2/3] Review updates

---
 .../Transforms/IPO/FunctionSpecialization.h   | 20 +++++++--
 .../Transforms/IPO/FunctionSpecialization.cpp | 25 ++++-------
 .../discover-transitive-phis.ll               |  2 +-
 .../phi-nodes-can-constfold.ll                | 42 +++++++++++++++++++
 .../phi-nodes-non-constfoldable.ll            |  2 +-
 5 files changed, 68 insertions(+), 23 deletions(-)
 create mode 100644 llvm/test/Transforms/FunctionSpecialization/phi-nodes-can-constfold.ll

diff --git a/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h b/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
index 882bda0f4cdefe6..93df7ee7f342fca 100644
--- a/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
+++ b/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
@@ -217,9 +217,23 @@ class InstCostVisitor : public InstVisitor<InstCostVisitor, Constant *> {
   Cost estimateSwitchInst(SwitchInst &I);
   Cost estimateBranchInst(BranchInst &I);
 
-  bool discoverTransitivelyIncomingValues(
-      Constant *Const, PHINode *Root, DenseSet<PHINode *> &TransitivePHIs,
-      SmallVectorImpl<PHINode *> &UnknownIncomingValues);
+  // Transitive Incoming Values are chains of PHI Nodes that
+  // may all refer to the same value.
+  //
+  // For example:
+  //
+  // %a = load %0
+  // %c = phi [%a, %d]
+  // %d = phi [%e, %c]
+  // %e = phi [%c, %f]
+  // %f = phi [%j, %h]
+  // %j = phi [%h, %j]
+  // %h = phi [%g, %c]
+  //
+  // In the real world, there would be branches and other code between these
+  // phi-nodes.
+  bool discoverTransitivelyIncomingValues(Constant *Const, PHINode *Root,
+                                          DenseSet<PHINode *> &TransitivePHIs);
 
   Constant *visitInstruction(Instruction &I) { return nullptr; }
   Constant *visitPHINode(PHINode &I);
diff --git a/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp b/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
index a0a912ecdb897a6..a4c12006ee2433a 100644
--- a/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
@@ -270,8 +270,7 @@ Cost InstCostVisitor::estimateBranchInst(BranchInst &I) {
 }
 
 bool InstCostVisitor::discoverTransitivelyIncomingValues(
-    Constant *Const, PHINode *Root, DenseSet<PHINode *> &TransitivePHIs,
-    SmallVectorImpl<PHINode *> &UnknownIncomingValues) {
+    Constant *Const, PHINode *Root, DenseSet<PHINode *> &TransitivePHIs) {
 
   SmallVector<PHINode *, 64> WorkList;
   WorkList.push_back(Root);
@@ -281,13 +280,8 @@ bool InstCostVisitor::discoverTransitivelyIncomingValues(
     PHINode *PN = WorkList.pop_back_val();
 
     if (++Iter > MaxDiscoveryIterations ||
-        PN->getNumIncomingValues() > MaxIncomingPhiValues) {
-      // For now just collect the Phi and later we will check whether it is
-      // in the Transitive set.
-      UnknownIncomingValues.push_back(PN);
-      continue;
-      // FIXME: return false here and remove the UnknownIncomingValues entirely.
-    }
+        PN->getNumIncomingValues() > MaxIncomingPhiValues)
+      return false;
 
     if (!TransitivePHIs.insert(PN).second)
       continue;
@@ -324,8 +318,6 @@ Constant *InstCostVisitor::visitPHINode(PHINode &I) {
     return nullptr;
 
   bool Inserted = VisitedPHIs.insert(&I).second;
-  SmallVector<PHINode *, 8> UnknownIncomingValues;
-  DenseSet<PHINode *> TransitivePHIs;
   Constant *Const = nullptr;
   bool HaveSeenIncomingPHI = false;
 
@@ -363,19 +355,16 @@ Constant *InstCostVisitor::visitPHINode(PHINode &I) {
     return nullptr;
   }
 
-  assert(Const && "Should have found at least one constant incoming value");
+  if (!Const)
+    return nullptr;
 
   if (!HaveSeenIncomingPHI)
     return Const;
 
-  if (!discoverTransitivelyIncomingValues(Const, &I, TransitivePHIs,
-                                          UnknownIncomingValues))
+  DenseSet<PHINode *> TransitivePHIs;
+  if (!discoverTransitivelyIncomingValues(Const, &I, TransitivePHIs))
     return nullptr;
 
-  for (PHINode *Phi : UnknownIncomingValues)
-    if (!TransitivePHIs.contains(Phi))
-      return nullptr;
-
   return Const;
 }
 
diff --git a/llvm/test/Transforms/FunctionSpecialization/discover-transitive-phis.ll b/llvm/test/Transforms/FunctionSpecialization/discover-transitive-phis.ll
index 7e359d22bcfae8a..b4c24715037bcaf 100644
--- a/llvm/test/Transforms/FunctionSpecialization/discover-transitive-phis.ll
+++ b/llvm/test/Transforms/FunctionSpecialization/discover-transitive-phis.ll
@@ -1,7 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ;
 ; RUN: opt -passes="ipsccp<func-spec>" -funcspec-min-function-size=20 -funcspec-for-literal-constant -S < %s | FileCheck %s --check-prefix=FUNCSPEC
-; RUN: opt -passes="ipsccp<func-spec>" -funcspec-min-function-size=20 -funcspec-for-literal-constant -funcspec-max-discovery-iterations=11 -S < %s | FileCheck %s --check-prefix=NOFUNCSPEC
+; RUN: opt -passes="ipsccp<func-spec>" -funcspec-min-function-size=20 -funcspec-for-literal-constant -funcspec-max-discovery-iterations=16 -S < %s | FileCheck %s --check-prefix=NOFUNCSPEC
 
 define i64 @bar(i1 %c1, i1 %c2, i1 %c3, i1 %c4, i1 %c5, i1 %c6, i1 %c7, i1 %c8, i1 %c9, i1 %c10) {
 ; FUNCSPEC-LABEL: define i64 @bar(
diff --git a/llvm/test/Transforms/FunctionSpecialization/phi-nodes-can-constfold.ll b/llvm/test/Transforms/FunctionSpecialization/phi-nodes-can-constfold.ll
new file mode 100644
index 000000000000000..5865b5492e1f54d
--- /dev/null
+++ b/llvm/test/Transforms/FunctionSpecialization/phi-nodes-can-constfold.ll
@@ -0,0 +1,42 @@
+; RUN: opt -passes="ipsccp<func-spec>" -funcspec-min-function-size=10 -funcspec-for-literal-constant -S < %s | FileCheck %s
+
+define i64 @bar(i1 %c1, i1 %c2, i1 %c3, i1 %c4, i64 %x1) {
+; CHECK-LABEL: define i64 @bar(
+; CHECK-SAME: i1 [[C1:%.*]], i1 [[C2:%.*]], i1 [[C3:%.*]], i1 [[C4:%.*]], i64 [[X1:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[F1:%.*]] = call i64 @foo.specialized.1(i64 3, i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]])
+; CHECK-NEXT:    [[F2:%.*]] = call i64 @foo(i64 [[X1]], i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]])
+; CHECK-NEXT:    [[ADD:%.*]] = add i64 [[F1]], [[F2]]
+; CHECK-NEXT:    ret i64 [[ADD]]
+;
+entry:
+  %f1 = call i64 @foo(i64 3, i1 %c1, i1 %c2, i1 %c3, i1 %c4)
+  %f2 = call i64 @foo(i64 %x1, i1 %c1, i1 %c2, i1 %c3, i1 %c4)
+  %add = add i64 %f1, %f2
+  ret i64 %add
+}
+
+define internal i64 @foo(i64 %n, i1 %c1, i1 %c2, i1 %c3, i1 %c4) {
+entry:
+  br label %l0
+  
+l1:
+  %phi1 = phi i64 [ %phi0, %l0 ], [ %phi2, %l2 ]
+  %add = add i64 %phi1, 1
+  %div = sdiv i64 %add, 2
+  br i1 %c2, label %l2, label %exit
+
+l2:
+  %phi2 = phi i64 [ %phi0, %l0 ], [ %phi1, %l1 ]
+  %sub = sub i64 %phi2, 1
+  %mul = mul i64 %sub, 2
+  br i1 %c4, label %l1, label %exit
+
+l0:
+  %phi0 = phi i64 [ %n, %entry ]
+  br i1 %c1, label %l1, label %l2
+
+exit:
+  %res = phi i64 [ %div, %l1 ], [ %mul, %l2]
+  ret i64 %res
+}
diff --git a/llvm/test/Transforms/FunctionSpecialization/phi-nodes-non-constfoldable.ll b/llvm/test/Transforms/FunctionSpecialization/phi-nodes-non-constfoldable.ll
index 877997a70505794..11b71d6667b985b 100644
--- a/llvm/test/Transforms/FunctionSpecialization/phi-nodes-non-constfoldable.ll
+++ b/llvm/test/Transforms/FunctionSpecialization/phi-nodes-non-constfoldable.ll
@@ -1,4 +1,4 @@
-; RUN: opt -passes="ipsccp<func-spec>" -funcspec-min-function-size=20 -funcspec-for-literal-constant -S < %s
+; RUN: opt -passes="ipsccp<func-spec>" -funcspec-min-function-size=10 -funcspec-for-literal-constant -S < %s | FileCheck %s
 
 define i64 @bar(i1 %c1, i1 %c2, i1 %c3, i1 %c4, i64 %x1) {
 ; CHECK-LABEL: define i64 @bar(

>From e0d6e68f389dff0e345a239a9190ccbb638230c7 Mon Sep 17 00:00:00 2001
From: Mats Petersson <mats.petersson at arm.com>
Date: Wed, 22 Nov 2023 10:19:27 +0000
Subject: [PATCH 3/3] Improve comment wording

---
 .../Transforms/IPO/FunctionSpecialization.h   | 22 +++++++++----------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h b/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
index 93df7ee7f342fca..b001771951e0fe5 100644
--- a/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
+++ b/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
@@ -217,21 +217,19 @@ class InstCostVisitor : public InstVisitor<InstCostVisitor, Constant *> {
   Cost estimateSwitchInst(SwitchInst &I);
   Cost estimateBranchInst(BranchInst &I);
 
-  // Transitive Incoming Values are chains of PHI Nodes that
-  // may all refer to the same value.
+  // Transitively Incoming Values (TIV) is a set of Values that can "feed" a
+  // value to the initial PHI-node. It is defined like this:
   //
-  // For example:
+  // * the initial PHI-node belongs to TIV.
   //
-  // %a = load %0
-  // %c = phi [%a, %d]
-  // %d = phi [%e, %c]
-  // %e = phi [%c, %f]
-  // %f = phi [%j, %h]
-  // %j = phi [%h, %j]
-  // %h = phi [%g, %c]
+  // * for every PHI-node in TIV, its operands belong to TIV
   //
-  // In the real world, there would be branches and other code between these
-  // phi-nodes.
+  // If TIV for the initial PHI-node (P) contains more than one constant or a
+  // value that is not a PHI-node, then P cannot be folded to a constant.
+  //
+  // As soon as we detect these cases, we bail, without constructing the
+  // full TIV.
+  // Otherwise P can be folded to the one constant in TIV.
   bool discoverTransitivelyIncomingValues(Constant *Const, PHINode *Root,
                                           DenseSet<PHINode *> &TransitivePHIs);
 


