[PATCH] D25794: [SCEV] Add a threshold to restrict number of mul operands to be inlined into SCEV

Wed Oct 19 14:20:30 PDT 2016

lihuang created this revision.
lihuang added reviewers: sanjoy, mehdi_amini, mzolotukhin.
lihuang added a subscriber: llvm-commits.

[SCEV] Add a threshold to restrict number of mul operands to be inlined into SCEV

This addresses the same slow-compile issue in SCEV that https://reviews.llvm.org/D3127 tried to solve; that patch is no longer active.

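Without a threshold on mul operand inlining, getMulExpr can take exponential time in the worst case: when a product is repeatedly multiplied by itself, as in the test below, each step doubles the number of operands inlined into the flattened SCEV. PR18606 is an example that causes clang to hang.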
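
As a rough illustration of the growth described above, the following standalone sketch models flattening with a threshold. It is not LLVM code: Expr, makeMul and operandsAfterSquaring are hypothetical names, and the model only mirrors the "inline unless the operand count exceeds the threshold" rule from this patch, not the rest of getMulExpr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a SCEV node: a leaf has no operands,
// a product has one or more.
struct Expr {
  std::vector<std::shared_ptr<const Expr>> Ops;
  bool isMul() const { return !Ops.empty(); }
};
using ExprPtr = std::shared_ptr<const Expr>;

ExprPtr makeLeaf() { return std::make_shared<Expr>(); }

// Builds a product of In. An operand that is itself a product is inlined
// (its operands are spliced in) unless it has more than MaxInline operands,
// mirroring the "getNumOperands() > MaxMulOpsToInline" check in the patch.
ExprPtr makeMul(const std::vector<ExprPtr> &In, std::size_t MaxInline) {
  assert(!In.empty() && "a product needs at least one operand");
  auto Result = std::make_shared<Expr>();
  for (const ExprPtr &Op : In) {
    if (Op->isMul() && Op->Ops.size() <= MaxInline)
      Result->Ops.insert(Result->Ops.end(), Op->Ops.begin(), Op->Ops.end());
    else
      Result->Ops.push_back(Op);
  }
  return Result;
}

// Squares a leaf Steps times (x = x * x) and returns the number of
// top-level operands of the final product.
std::size_t operandsAfterSquaring(unsigned Steps, std::size_t MaxInline) {
  ExprPtr X = makeLeaf();
  for (unsigned I = 0; I < Steps; ++I)
    X = makeMul({X, X}, MaxInline);
  return X->Ops.size();
}
```

In this model, two squarings with a threshold of 1 leave a two-operand product of products, while a threshold of 10 yields four operands, which matches the shape of the CHECK1 and CHECK10 expectations in the test. With an effectively unlimited threshold the operand count doubles at every step.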

However, this patch does not completely solve the slow-compile issue in SCEV. When the SCEV DAG is not completely flattened, SCEVRewriteVisitor becomes another bottleneck: it visits the same SCEV multiple times and can take exponential time to finish. I will fix this in another patch.
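
To make the repeated-visit cost concrete, here is a small standalone sketch that counts visits over an unflattened squaring DAG. It is not LLVM code: Node, countNaiveVisits, countMemoizedVisits and buildSquaringChain are hypothetical names. The memoized variant is shown only for comparison and is an assumption about one possible remedy, not necessarily what the follow-up patch will do.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical DAG node: a leaf has no operands.
struct Node {
  std::vector<const Node *> Ops;
};

// Naive recursive visit: a shared sub-expression is visited once per path
// that reaches it.
std::size_t countNaiveVisits(const Node *N) {
  assert(N != nullptr);
  std::size_t Visits = 1;
  for (const Node *Op : N->Ops)
    Visits += countNaiveVisits(Op);
  return Visits;
}

// Memoized visit: each distinct node is visited exactly once.
std::size_t countMemoizedVisits(const Node *N,
                                std::unordered_set<const Node *> &Seen) {
  assert(N != nullptr);
  if (!Seen.insert(N).second)
    return 0; // already visited
  std::size_t Visits = 1;
  for (const Node *Op : N->Ops)
    Visits += countMemoizedVisits(Op, Seen);
  return Visits;
}

// Builds Depth nested squarings over a single leaf. Storage owns the nodes;
// reserving up front avoids reallocation, so the pointers stay valid.
const Node *buildSquaringChain(unsigned Depth, std::vector<Node> &Storage) {
  Storage.clear();
  Storage.reserve(Depth + 1);
  Storage.push_back(Node{});
  for (unsigned I = 0; I < Depth; ++I) {
    const Node *Prev = &Storage.back();
    Storage.push_back(Node{{Prev, Prev}});
  }
  return &Storage.back();
}
```

For a chain of d squarings the naive count satisfies V(d) = 1 + 2 * V(d - 1) with V(0) = 1, while the memoized count is d + 1, one per distinct node.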


https://reviews.llvm.org/D25794

Files:
  lib/Analysis/ScalarEvolution.cpp
  test/Analysis/ScalarEvolution/max-mulops-inline.ll


Index: lib/Analysis/ScalarEvolution.cpp
===================================================================

--- lib/Analysis/ScalarEvolution.cpp
+++ lib/Analysis/ScalarEvolution.cpp
@@ -121,6 +121,12 @@
                   cl::desc("Verify no dangling value in ScalarEvolution's "
                            "ExprValueMap (slow)"));
 
+static cl::opt<unsigned> MaxMulOpsToInline(
+    "scev-max-mulops-inline", cl::Hidden,
+    cl::desc("Maximum number of multiplication operands to be "
+             "inlined to current multiplication SCEV"),
+    cl::init(1000));
+
 //===----------------------------------------------------------------------===//
 //                           SCEV class definitions
 //===----------------------------------------------------------------------===//
@@ -2505,6 +2511,8 @@
   if (Idx < Ops.size()) {
     bool DeletedMul = false;
     while (const SCEVMulExpr *Mul = dyn_cast<SCEVMulExpr>(Ops[Idx])) {
+      if (Mul->getNumOperands() > MaxMulOpsToInline)
+        break;
       // If we have an mul, expand the mul operands onto the end of the operands
       // list.
       Ops.erase(Ops.begin()+Idx);
Index: test/Analysis/ScalarEvolution/max-mulops-inline.ll
===================================================================
--- test/Analysis/ScalarEvolution/max-mulops-inline.ll
+++ test/Analysis/ScalarEvolution/max-mulops-inline.ll
@@ -0,0 +1,46 @@
+; RUN: opt -analyze -scalar-evolution -scev-max-mulops-inline=1 < %s | FileCheck --check-prefix=CHECK1 %s
+; RUN: opt -analyze -scalar-evolution -scev-max-mulops-inline=10 < %s | FileCheck --check-prefix=CHECK10 %s
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+@a = local_unnamed_addr global i32 0, align 4
+@b = local_unnamed_addr global i32 0, align 4
+@c = local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: norecurse nounwind uwtable
+define i32 @main() local_unnamed_addr {
+
+; CHECK1: %mul.1 = mul nsw i32 %mul, %mul
+; CHECK1: -->  ((%mul.lcssa5 * %mul.lcssa5) * (%mul.lcssa5 * %mul.lcssa5))
+
+; CHECK10: %mul.1 = mul nsw i32 %mul, %mul
+; CHECK10: -->  (%mul.lcssa5 * %mul.lcssa5 * %mul.lcssa5 * %mul.lcssa5)
+
+entry:
+  %a.promoted4 = load i32, i32* @a, align 4
+  br label %for.cond1.preheader
+
+for.cond1.preheader:                              ; preds = %for.body3, %entry
+  %mul.lcssa5 = phi i32 [ %a.promoted4, %entry ], [ %mul.5, %for.body3 ]
+  %i.03 = phi i32 [ 0, %entry ], [ %inc5, %for.body3 ]
+  br label %for.body3
+
+for.body3:                                        ; preds = %for.cond1.preheader
+  %mul = mul nsw i32 %mul.lcssa5, %mul.lcssa5
+  %mul.1 = mul nsw i32 %mul, %mul
+  %mul.2 = mul nsw i32 %mul.1, %mul.1
+  %mul.3 = mul nsw i32 %mul.2, %mul.2
+  %mul.4 = mul nsw i32 %mul.3, %mul.3
+  %mul.5 = mul nsw i32 %mul.4, %mul.4
+  %inc5 = add nsw i32 %i.03, 1
+  %cmp = icmp slt i32 %inc5, 10
+  br i1 %cmp, label %for.cond1.preheader, label %for.end6
+
+for.end6:                                         ; preds = %for.body3
+  %mul.lcssa.lcssa = phi i32 [ %mul.5, %for.body3 ]
+  %inc.lcssa.lcssa = phi i32 [ 6, %for.body3 ]
+  store i32 %mul.lcssa.lcssa, i32* @a, align 4
+  store i32 %inc.lcssa.lcssa, i32* @b, align 4
+  ret i32 0
+}

