[polly] Add profitability check for expanded region. (PR #96548)

Mon Jun 24 13:22:46 PDT 2024

https://github.com/huihzhang created https://github.com/llvm/llvm-project/pull/96548

Region expansion may append a basic block containing memory accesses that are not used in the loops of the original region or in any loops of the regions to be joined. These additional memory accesses add complexity to building the Scop, generating runtime alias checks, and computing the optimization schedule. In certain cases they may make the original region infeasible to analyze, e.g., an access to a memory location that was loop invariant for a loop in the original region.

This patch adds a profitability check for region expansion. The expanded region is marked as unprofitable if it includes basic blocks with memory accesses that are not used in loops of the expanded region.
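
The rule described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code in the patch: `Block` and `isExpansionProfitable` are hypothetical names that stand in for LLVM's `BasicBlock`, `Loop` and `Region` queries, and the first element of the vector is assumed to be the region entry.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a basic block of the expanded region.
// None of these fields are LLVM types; they only model the queries
// the check needs.
struct Block {
  std::string Name;
  int LoopId;          // outermost in-region loop containing the block, -1 if none
  int PreheaderOf;     // loop whose preheader this block is, -1 if none
  bool HasMemAccess;   // block contains a load or a store
  bool OnlyTerminator; // block holds nothing but its terminator
};

// Returns true if the modeled expanded region looks profitable.
// Assumption: Blocks[0] is the region entry.
bool isExpansionProfitable(const std::vector<Block> &Blocks) {
  // Collect the outermost loops found in the region.
  std::vector<int> Loops;
  for (const Block &B : Blocks)
    if (B.LoopId >= 0 &&
        std::find(Loops.begin(), Loops.end(), B.LoopId) == Loops.end())
      Loops.push_back(B.LoopId);

  // A region without loops is not worth expanding into.
  if (Loops.empty())
    return false;

  for (std::size_t I = 0; I < Blocks.size(); ++I) {
    const Block &B = Blocks[I];
    // Skip trivial blocks and the region entry.
    if (B.OnlyTerminator || I == 0)
      continue;
    // Skip blocks inside a collected loop, and preheaders of such loops,
    // which may hold loop-invariant loads.
    bool IsPreheader =
        B.PreheaderOf >= 0 &&
        std::find(Loops.begin(), Loops.end(), B.PreheaderOf) != Loops.end();
    if (B.LoopId >= 0 || IsPreheader)
      continue;
    // A memory access outside every loop makes the expansion unprofitable.
    if (B.HasMemAccess)
      return false;
  }
  return true;
}
```

In this model, a region made of an entry block plus loop blocks is accepted, while appending a trailing block that loads or stores outside every loop causes the expansion to be rejected, so the original unexpanded region would be kept.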

From f745c79c8fd0b5e4fd37136a677fa91b1371e640 Mon Sep 17 00:00:00 2001
From: Huihui Zhang <huihuiz at quicinc.com>
Date: Mon, 24 Jun 2024 13:07:07 -0700
Subject: [PATCH] [polly] Add profitability check for expanded region.

Region expansion may append a basic block containing memory accesses that are
not used in the loops of the original region or in any loops of the regions to
be joined. These additional memory accesses add complexity to building the
Scop, generating runtime alias checks, and computing the optimization schedule.
In certain cases they may make the original region infeasible to analyze, e.g.,
an access to a memory location that was loop invariant for a loop in the
original region.

This patch adds a profitability check for region expansion. The expanded region
is marked as unprofitable if it includes basic blocks with memory accesses that
are not used in loops of the expanded region.
---
 polly/include/polly/ScopDetection.h           |  14 ++
 polly/lib/Analysis/ScopDetection.cpp          |  61 ++++++++
 polly/test/CodeGen/empty_domain_in_context.ll |   3 +-
 ...invariant_loads_ignore_parameter_bounds.ll |   3 +-
 polly/test/CodeGen/multiple-scops-in-a-row.ll |   3 +-
 polly/test/CodeGen/reduction_2.ll             |   4 +-
 .../region_expansion_profitability_check.ll   | 138 ++++++++++++++++++
 .../CodeGen/scev_looking_through_bitcasts.ll  |   3 +-
 .../two-loops-right-after-each-other-2.ll     |   3 +-
 polly/test/DeLICM/skip_notinloop.ll           |   3 +-
 .../ScopDetect/expand-region-correctly-2.ll   |   3 +-
 .../ScopDetect/expand-region-correctly.ll     |   3 +-
 ...ces-loop-scev-with-unknown-iterations-2.ll |   8 +-
 ...ces-loop-scev-with-unknown-iterations-3.ll |   9 +-
 ...ences-loop-scev-with-unknown-iterations.ll |   9 +-
 polly/test/ScopInfo/cfg_consequences.ll       |   3 +-
 .../ScopInfo/complex-successor-structure-3.ll |   3 +-
 .../ScopInfo/complex_execution_context.ll     |   4 +-
 .../ScopInfo/infeasible_invalid_context.ll    |   3 +-
 .../ScopInfo/long-sequence-of-error-blocks.ll |   3 +-
 ...-pure-function-calls-causes-dead-blocks.ll |   3 +-
 .../test/ScopInfo/phi-in-non-affine-region.ll |   3 +-
 polly/test/ScopInfo/phi_after_error_block.ll  |   3 +-
 polly/test/ScopInfo/remarks.ll                |   4 +-
 .../ScopInfo/user_provided_assumptions.ll     |   6 +
 25 files changed, 271 insertions(+), 31 deletions(-)
 create mode 100644 polly/test/CodeGen/region_expansion_profitability_check.ll

diff --git a/polly/include/polly/ScopDetection.h b/polly/include/polly/ScopDetection.h
index 5759f75463284..cf91f1fc6a94b 100644
--- a/polly/include/polly/ScopDetection.h
+++ b/polly/include/polly/ScopDetection.h
@@ -345,6 +345,20 @@ class ScopDetection {
   /// @return True if region is profitable to optimize, false otherwise.
   bool isProfitableRegion(DetectionContext &Context) const;
 
+  /// Check if an expanded region is profitable to optimize.
+  ///
+  /// An expanded region may include basic blocks with memory accesses that
+  /// are not used in loops of the expanded region. These memory accesses add
+  /// complexity to building the scop, computing the optimization schedule, and
+  /// building runtime alias checks. Such an expansion is not profitable and
+  /// should not replace the original unexpanded region.
+  ///
+  /// @param ExpandedRegion The expanded region to check.
+  ///
+  /// @return True if the expanded region is profitable to optimize.
+  bool isRegionExpansionProfitable(const Region &ExpandedRegion,
+                                   LoopInfo &LI) const;
+
   /// Check if a region is a Scop.
   ///
   /// @param Context The context of scop detection.
diff --git a/polly/lib/Analysis/ScopDetection.cpp b/polly/lib/Analysis/ScopDetection.cpp
index eab7bd83e6a4e..5cce818863960 100644
--- a/polly/lib/Analysis/ScopDetection.cpp
+++ b/polly/lib/Analysis/ScopDetection.cpp
@@ -228,6 +228,11 @@ static cl::opt<bool> PollyAllowErrorBlocks(
     cl::desc("Allow to speculate on the execution of 'error blocks'."),
     cl::Hidden, cl::init(true), cl::cat(PollyCategory));
 
+static cl::opt<bool> RegionExpansionProfitabilityCheck(
+    "polly-region-expansion-profitability-check",
+    cl::desc("Add profitability checks to expanded region"), cl::init(true),
+    cl::Hidden, cl::cat(PollyCategory));
+
 /// The minimal trip count under which loops are considered unprofitable.
 static const unsigned MIN_LOOP_TRIP_COUNT = 8;
 
@@ -1621,6 +1626,12 @@ void ScopDetection::findScops(Region &R) {
     if (!ExpandedR)
       continue;
 
+    if (!isRegionExpansionProfitable(*ExpandedR, LI)) {
+      removeCachedResults(*ExpandedR);
+      delete ExpandedR;
+      continue;
+    }
+
     R.addSubRegion(ExpandedR, true);
     ValidRegions.insert(ExpandedR);
     removeCachedResults(*CurrentRegion);
@@ -1749,6 +1760,56 @@ bool ScopDetection::isProfitableRegion(DetectionContext &Context) const {
   return invalid<ReportUnprofitable>(Context, /*Assert=*/true, &CurRegion);
 }
 
+bool ScopDetection::isRegionExpansionProfitable(const Region &ExpandedRegion,
+                                                LoopInfo &LI) const {
+  if (!RegionExpansionProfitabilityCheck)
+    return true;
+
+  POLLY_DEBUG(dbgs() << "\nChecking expanded region: "
+                     << ExpandedRegion.getNameStr() << "\n");
+
+  // Collect outermost loops from expanded region.
+  SmallPtrSet<const Loop *, 2> Loops;
+  for (auto BB : ExpandedRegion.blocks()) {
+    Loop *L = ExpandedRegion.outermostLoopInRegion(&LI, BB);
+    if (L)
+      Loops.insert(L);
+  }
+
+  if (Loops.empty()) {
+    POLLY_DEBUG(dbgs() << "Unprofitable expanded region: no loops found.\n");
+    return false;
+  }
+
+  // Treat the region expansion as unprofitable if it contains basic blocks with
+  // memory accesses outside the outermost loops of the expanded region.
+  for (auto BB : ExpandedRegion.blocks()) {
+    if (&BB->front() == BB->getTerminator())
+      continue;
+    if (BB == ExpandedRegion.getEntry())
+      continue;
+
+    // Skip loop preheader block that may contain loop invariant loads.
+    if (llvm::any_of(Loops, [&](const Loop *L) {
+          return L->contains(BB) || (BB == L->getLoopPreheader());
+        }))
+      continue;
+
+    if (llvm::any_of(*BB, [](const Instruction &I) {
+          return isa<LoadInst>(&I) || isa<StoreInst>(&I);
+        })) {
+      POLLY_DEBUG(dbgs() << "Unprofitable expanded region:\n";
+                  dbgs() << "\tBasicBlock: " << BB->getName() << "\n";
+                  dbgs() << "\tcontains memory accesses but does not belong "
+                            "to outermost loops of the expanded region.\n\n");
+      return false;
+    }
+  }
+
+  POLLY_DEBUG(dbgs() << "Expanded region seems profitable.\n\n");
+  return true;
+}
+
 bool ScopDetection::isValidRegion(DetectionContext &Context) {
   Region &CurRegion = Context.CurRegion;
 
diff --git a/polly/test/CodeGen/empty_domain_in_context.ll b/polly/test/CodeGen/empty_domain_in_context.ll
index a2fe805f402e0..977e3db2eefc5 100644
--- a/polly/test/CodeGen/empty_domain_in_context.ll
+++ b/polly/test/CodeGen/empty_domain_in_context.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly '-passes=polly-optree,polly-opt-isl,polly-codegen' -S < %s | FileCheck %s
+; RUN: opt %loadNPMPolly '-passes=polly-optree,polly-opt-isl,polly-codegen' \
+; RUN: -polly-region-expansion-profitability-check=0 -S < %s | FileCheck %s
 ;
 ; llvm.org/PR35362
 ; isl codegen does not allow to generate isl_ast_expr from pw_aff which have an
diff --git a/polly/test/CodeGen/invariant_loads_ignore_parameter_bounds.ll b/polly/test/CodeGen/invariant_loads_ignore_parameter_bounds.ll
index 19b30afd33ba7..52dc871b844a6 100644
--- a/polly/test/CodeGen/invariant_loads_ignore_parameter_bounds.ll
+++ b/polly/test/CodeGen/invariant_loads_ignore_parameter_bounds.ll
@@ -1,5 +1,6 @@
 ; RUN: opt %loadNPMPolly -passes=polly-codegen -polly-invariant-load-hoisting \
-; RUN:     -polly-ignore-parameter-bounds -S < %s | FileCheck %s
+; RUN: -polly-ignore-parameter-bounds -polly-region-expansion-profitability-check=0 \
+; RUN: -S < %s | FileCheck %s
 
 ; CHECK: polly.preload.begin:
 ; CHECK-NEXT: %global.load = load i32, ptr @global, align 4, !alias.scope !0, !noalias !3
diff --git a/polly/test/CodeGen/multiple-scops-in-a-row.ll b/polly/test/CodeGen/multiple-scops-in-a-row.ll
index b81ba04e36463..274d2555100ff 100644
--- a/polly/test/CodeGen/multiple-scops-in-a-row.ll
+++ b/polly/test/CodeGen/multiple-scops-in-a-row.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly -S -passes=polly-codegen < %s | FileCheck %s
+; RUN: opt %loadNPMPolly -S -passes=polly-codegen \
+; RUN: -polly-region-expansion-profitability-check=0 < %s | FileCheck %s
 
 ; This test case has two scops in a row. When code generating the first scop,
 ; the second scop is invalidated. This test case verifies that we do not crash
diff --git a/polly/test/CodeGen/reduction_2.ll b/polly/test/CodeGen/reduction_2.ll
index 4aa306775e781..7a359bf2d6865 100644
--- a/polly/test/CodeGen/reduction_2.ll
+++ b/polly/test/CodeGen/reduction_2.ll
@@ -1,4 +1,6 @@
-; RUN: opt %loadNPMPolly -aa-pipeline=basic-aa -polly-invariant-load-hoisting=true '-passes=print<polly-ast>' -disable-output < %s | FileCheck %s --allow-empty
+; RUN: opt %loadNPMPolly -aa-pipeline=basic-aa -polly-invariant-load-hoisting=true \
+; RUN: -polly-region-expansion-profitability-check=0 '-passes=print<polly-ast>' \
+; RUN: -disable-output < %s | FileCheck %s --allow-empty
 
 ;#include <string.h>
 ;#include <stdio.h>
diff --git a/polly/test/CodeGen/region_expansion_profitability_check.ll b/polly/test/CodeGen/region_expansion_profitability_check.ll
new file mode 100644
index 0000000000000..de8d0154e20bb
--- /dev/null
+++ b/polly/test/CodeGen/region_expansion_profitability_check.ll
@@ -0,0 +1,138 @@
+; RUN: opt -polly-use-llvm-names -polly-region-expansion-profitability-check=1 \
+; RUN: "-passes=scop(polly-opt-isl,print<polly-ast>)" -disable-output < %s | \
+; RUN: FileCheck %s --check-prefix=HEURISTIC
+; RUN: opt -polly-region-expansion-profitability-check=0 -passes=polly-codegen \
+; RUN: -pass-remarks-analysis="polly-scops" -disable-output < %s 2>&1 | \
+; RUN: FileCheck %s --check-prefix=NO_HEURISTIC
+
+; void test(int **restrict a, int *restrict b, int *restrict c,
+;           int *restrict d, int *restrict e, int *restrict f,
+;	    int L, int M, int Val0) {
+;   for (int i = 1; i <= L; i++) { // L1
+;     for (int k = 1; k <= M; k++) { // L2: vectorizable loop.
+;       b[k] += e[k-1];
+;       if (k < M)
+;         c[k] += d[k];
+;     }
+;     // Memory accesses to a[i-1],a[i] are not used in L2.
+;     if ((Val0 = a[i-1][4]) > -987654321)
+;      a[i][4] = Val0;
+;   }
+; }
+
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
+
+define dso_local void @test(ptr noalias nocapture noundef readonly %a, ptr noalias nocapture noundef %b, ptr noalias nocapture noundef %c, ptr noalias nocapture noundef readonly %d, ptr noalias nocapture noundef readonly %e, ptr noalias nocapture noundef readnone %f, i32 noundef %L, i32 noundef %M, i32 noundef %Val0) {
+entry:
+  %cmp.not39 = icmp slt i32 %L, 1
+  br i1 %cmp.not39, label %for.cond.cleanup, label %for.cond1.preheader.lr.ph
+
+for.cond1.preheader.lr.ph:                        ; preds = %entry
+  %invariant.gep = getelementptr i8, ptr %e, i64 -4
+  %cmp2.not37 = icmp slt i32 %M, 1
+  br i1 %cmp2.not37, label %for.cond1.preheader.us.preheader, label %for.cond1.preheader.preheader
+
+for.cond1.preheader.preheader:                    ; preds = %for.cond1.preheader.lr.ph
+  %0 = zext nneg i32 %M to i64
+  %1 = add nuw i32 %M, 1
+  %2 = add nuw i32 %L, 1
+  %wide.trip.count46 = zext i32 %2 to i64
+  %wide.trip.count = zext i32 %1 to i64
+  br label %for.cond1.preheader
+
+for.cond1.preheader.us.preheader:                 ; preds = %for.cond1.preheader.lr.ph
+  %3 = add nuw i32 %L, 1
+  %wide.trip.count51 = zext i32 %3 to i64
+  br label %for.cond1.preheader.us
+
+for.cond1.preheader.us:                           ; preds = %for.cond1.preheader.us.preheader, %for.inc23.us
+  %indvars.iv48 = phi i64 [ 1, %for.cond1.preheader.us.preheader ], [ %indvars.iv.next49, %for.inc23.us ]
+  %4 = getelementptr ptr, ptr %a, i64 %indvars.iv48
+  %arrayidx15.us = getelementptr i8, ptr %4, i64 -8
+  %5 = load ptr, ptr %arrayidx15.us, align 8
+  %arrayidx16.us = getelementptr inbounds i8, ptr %5, i64 16
+  %6 = load i32, ptr %arrayidx16.us, align 4
+  %cmp17.us = icmp sgt i32 %6, -987654321
+  br i1 %cmp17.us, label %if.then18.us, label %for.inc23.us
+
+if.then18.us:                                     ; preds = %for.cond1.preheader.us
+  %7 = load ptr, ptr %4, align 8
+  %arrayidx21.us = getelementptr inbounds i8, ptr %7, i64 16
+  store i32 %6, ptr %arrayidx21.us, align 4
+  br label %for.inc23.us
+
+for.inc23.us:                                     ; preds = %if.then18.us, %for.cond1.preheader.us
+  %indvars.iv.next49 = add nuw nsw i64 %indvars.iv48, 1
+  %exitcond52.not = icmp eq i64 %indvars.iv.next49, %wide.trip.count51
+  br i1 %exitcond52.not, label %for.cond.cleanup, label %for.cond1.preheader.us
+
+for.cond1.preheader:                              ; preds = %for.cond1.preheader.preheader, %for.inc23
+  %indvars.iv43 = phi i64 [ 1, %for.cond1.preheader.preheader ], [ %indvars.iv.next44, %for.inc23 ]
+  br label %for.body4
+
+for.cond.cleanup:                                 ; preds = %for.inc23, %for.inc23.us, %entry
+  ret void
+
+for.cond1.for.cond.cleanup3_crit_edge:            ; preds = %for.inc
+  %8 = getelementptr ptr, ptr %a, i64 %indvars.iv43
+  %arrayidx15 = getelementptr i8, ptr %8, i64 -8
+  %9 = load ptr, ptr %arrayidx15, align 8
+  %arrayidx16 = getelementptr inbounds i8, ptr %9, i64 16
+  %10 = load i32, ptr %arrayidx16, align 4
+  %cmp17 = icmp sgt i32 %10, -987654321
+  br i1 %cmp17, label %if.then18, label %for.inc23
+
+for.body4:                                        ; preds = %for.cond1.preheader, %for.inc
+  %indvars.iv = phi i64 [ 1, %for.cond1.preheader ], [ %indvars.iv.next, %for.inc ]
+  %gep = getelementptr i32, ptr %invariant.gep, i64 %indvars.iv
+  %11 = load i32, ptr %gep, align 4
+  %arrayidx6 = getelementptr inbounds i32, ptr %b, i64 %indvars.iv
+  %12 = load i32, ptr %arrayidx6, align 4
+  %add = add nsw i32 %12, %11
+  store i32 %add, ptr %arrayidx6, align 4
+  %cmp7 = icmp ult i64 %indvars.iv, %0
+  br i1 %cmp7, label %if.then, label %for.inc
+
+if.then:                                          ; preds = %for.body4
+  %arrayidx9 = getelementptr inbounds i32, ptr %d, i64 %indvars.iv
+  %13 = load i32, ptr %arrayidx9, align 4
+  %arrayidx11 = getelementptr inbounds i32, ptr %c, i64 %indvars.iv
+  %14 = load i32, ptr %arrayidx11, align 4
+  %add12 = add nsw i32 %14, %13
+  store i32 %add12, ptr %arrayidx11, align 4
+  br label %for.inc
+
+for.inc:                                          ; preds = %for.body4, %if.then
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond.not = icmp eq i64 %indvars.iv.next, %wide.trip.count
+  br i1 %exitcond.not, label %for.cond1.for.cond.cleanup3_crit_edge, label %for.body4
+
+if.then18:                                        ; preds = %for.cond1.for.cond.cleanup3_crit_edge
+  %15 = load ptr, ptr %8, align 8
+  %arrayidx21 = getelementptr inbounds i8, ptr %15, i64 16
+  store i32 %10, ptr %arrayidx21, align 4
+  br label %for.inc23
+
+for.inc23:                                        ; preds = %for.cond1.for.cond.cleanup3_crit_edge, %if.then18
+  %indvars.iv.next44 = add nuw nsw i64 %indvars.iv43, 1
+  %exitcond47.not = icmp eq i64 %indvars.iv.next44, %wide.trip.count46
+  br i1 %exitcond47.not, label %for.cond.cleanup, label %for.cond1.preheader
+}
+
+; Check that region %for.body4---%for.cond1.for.cond.cleanup3_crit_edge is
+; detected with region expansion profitability check.
+; HEURISTIC-LABEL: isl ast :: test :: %for.body4---%for.cond1.for.cond.cleanup3_crit_edge
+; HEURISTIC:      for (int c0 = 0; c0 < M; c0 += 1)
+; HEURISTIC-NEXT:	Stmt_for_body4(c0);
+; HEURISTIC-NEXT: for (int c0 = 0; c0 < M - 1; c0 += 1)
+; HEURISTIC-NEXT:	Stmt_if_then(c0);
+
+; Check that without region expansion heuristic, vectorization is missed.
+; NO_HEURISTIC: SCoP begins here.
+; NO_HEURISTIC-NEXT: No-overflows restriction:        [p_0, p_1] -> {  : p_0 = 2147483647 }
+; NO_HEURISTIC-NEXT: No-overflows restriction:        [p_0, p_1] -> {  : p_0 = 2147483647 }
+; NO_HEURISTIC-NEXT: Possibly aliasing pointer, use restrict keyword.
+; NO_HEURISTIC-NEXT: Possibly aliasing pointer, use restrict keyword.
+; NO_HEURISTIC-NEXT: Invariant load assumption:  [p_0, p_1] -> {  : false }
+; NO_HEURISTIC-NEXT: SCoP ends here but was dismissed.
diff --git a/polly/test/CodeGen/scev_looking_through_bitcasts.ll b/polly/test/CodeGen/scev_looking_through_bitcasts.ll
index 142e83f820fe7..7daeac9235a15 100644
--- a/polly/test/CodeGen/scev_looking_through_bitcasts.ll
+++ b/polly/test/CodeGen/scev_looking_through_bitcasts.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly -passes=polly-codegen -S < %s | FileCheck %s
+; RUN: opt %loadNPMPolly -passes=polly-codegen \
+; RUN: -polly-region-expansion-profitability-check=0 -S < %s | FileCheck %s
 ;
 ; Scalar write of bitcasted value. Instead of writing %b of type
 ; %structty, the SCEV expression looks through the bitcast such that
diff --git a/polly/test/CodeGen/two-loops-right-after-each-other-2.ll b/polly/test/CodeGen/two-loops-right-after-each-other-2.ll
index 1c68389eaeba8..44882d6423710 100644
--- a/polly/test/CodeGen/two-loops-right-after-each-other-2.ll
+++ b/polly/test/CodeGen/two-loops-right-after-each-other-2.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly -passes=polly-codegen -S < %s | FileCheck %s
+; RUN: opt %loadNPMPolly -passes=polly-codegen \
+; RUN: -polly-region-expansion-profitability-check=0 -S < %s | FileCheck %s
 
 ; CHECK:       polly.merge_new_and_old:
 ; CHECK-NEXT:    merge = phi
diff --git a/polly/test/DeLICM/skip_notinloop.ll b/polly/test/DeLICM/skip_notinloop.ll
index 8e265e19aefea..d545c489346f2 100644
--- a/polly/test/DeLICM/skip_notinloop.ll
+++ b/polly/test/DeLICM/skip_notinloop.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly '-passes=print<polly-delicm>' -pass-remarks-missed=polly-delicm -disable-output < %s 2>&1 | FileCheck %s
+; RUN: opt %loadNPMPolly '-passes=print<polly-delicm>' -pass-remarks-missed=polly-delicm \
+; RUN: -polly-region-expansion-profitability-check=0 -disable-output < %s 2>&1 | FileCheck %s
 ;
 ;    void func(double *A) {
 ;      double phi = 0.0;
diff --git a/polly/test/ScopDetect/expand-region-correctly-2.ll b/polly/test/ScopDetect/expand-region-correctly-2.ll
index df35d05674f95..2f4437d0265db 100644
--- a/polly/test/ScopDetect/expand-region-correctly-2.ll
+++ b/polly/test/ScopDetect/expand-region-correctly-2.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly '-passes=print<polly-detect>' -disable-output < %s 2>&1 | FileCheck %s
+; RUN: opt %loadNPMPolly '-passes=print<polly-detect>' -disable-output \
+; RUN: -polly-region-expansion-profitability-check=0 < %s 2>&1 | FileCheck %s
 ;
 ; CHECK: Valid Region for Scop: if.end.1631 => for.cond.1647.outer
 ;
diff --git a/polly/test/ScopDetect/expand-region-correctly.ll b/polly/test/ScopDetect/expand-region-correctly.ll
index a8c90c08fde0c..421ee75e40541 100644
--- a/polly/test/ScopDetect/expand-region-correctly.ll
+++ b/polly/test/ScopDetect/expand-region-correctly.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly '-passes=print<polly-detect>' -disable-output < %s 2>&1 | FileCheck %s
+; RUN: opt %loadNPMPolly '-passes=print<polly-detect>' -disable-output \
+; RUN: -polly-region-expansion-profitability-check=0 < %s 2>&1 | FileCheck %s
 
 ; CHECK: Valid Region for Scop: if.end.1631 => for.cond.1647.outer
 
diff --git a/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations-2.ll b/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations-2.ll
index 83743e4e4ecc7..46c65d604056b 100644
--- a/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations-2.ll
+++ b/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations-2.ll
@@ -1,8 +1,8 @@
-; RUN: opt %loadNPMPolly '-passes=print<polly-detect>' -disable-output < %s 2>&1 | \
-; RUN:     FileCheck %s -check-prefix=DETECT
+; RUN: opt %loadNPMPolly '-passes=print<polly-detect>' -disable-output \
+; RUN: -polly-region-expansion-profitability-check=0 < %s 2>&1 | FileCheck %s -check-prefix=DETECT
 
-; RUN: opt %loadNPMPolly '-passes=print<polly-function-scops>' -disable-output < %s 2>&1 | \
-; RUN:     FileCheck %s -check-prefix=SCOP
+; RUN: opt %loadNPMPolly '-passes=print<polly-function-scops>' -disable-output \
+; RUN: -polly-region-expansion-profitability-check=0 < %s 2>&1 | FileCheck %s -check-prefix=SCOP
 
 ; DETECT: Valid Region for Scop: loop => barrier
 ; DETECT-NEXT: Valid Region for Scop: branch => end
diff --git a/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations-3.ll b/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations-3.ll
index 9685ba37a49a1..7f1d78495e81c 100644
--- a/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations-3.ll
+++ b/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations-3.ll
@@ -1,8 +1,9 @@
-; RUN: opt %loadNPMPolly -polly-stmt-granularity=bb '-passes=print<polly-function-scops>' -disable-output < %s 2>&1 | \
-; RUN:     FileCheck %s -check-prefix=NONAFFINE
+; RUN: opt %loadNPMPolly -polly-stmt-granularity=bb '-passes=print<polly-function-scops>' \
+; RUN: -polly-region-expansion-profitability-check=0 -disable-output < %s 2>&1 | \
+; RUN: FileCheck %s -check-prefix=NONAFFINE
 ; RUN: opt %loadNPMPolly -polly-stmt-granularity=bb '-passes=print<polly-function-scops>' -disable-output \
-; RUN:     -polly-allow-nonaffine-branches=false < %s 2>&1 | \
-; RUN:     FileCheck %s -check-prefix=NO-NONEAFFINE
+; RUN: -polly-region-expansion-profitability-check=0 -polly-allow-nonaffine-branches=false < %s 2>&1 | \
+; RUN: FileCheck %s -check-prefix=NO-NONEAFFINE
 
 ; NONAFFINE:      Statements {
 ; NONAFFINE-NEXT: 	Stmt_loop
diff --git a/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations.ll b/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations.ll
index f41e6500fb30a..8d58968875ff0 100644
--- a/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations.ll
+++ b/polly/test/ScopInfo/branch-references-loop-scev-with-unknown-iterations.ll
@@ -1,8 +1,9 @@
-; RUN: opt %loadNPMPolly '-passes=print<polly-detect>,print<polly-function-scops>' -disable-output < %s 2>&1 | \
-; RUN:     FileCheck %s -check-prefix=NONAFFINE
+; RUN: opt %loadNPMPolly '-passes=print<polly-detect>,print<polly-function-scops>' \
+; RUN: -polly-region-expansion-profitability-check=0 -disable-output < %s 2>&1 | \
+; RUN: FileCheck %s -check-prefix=NONAFFINE
 ; RUN: opt %loadNPMPolly '-passes=print<polly-detect>,print<polly-function-scops>' -disable-output \
-; RUN:     -polly-allow-nonaffine-branches=false < %s 2>&1 | \
-; RUN:     FileCheck %s -check-prefix=NO-NONEAFFINE
+; RUN: -polly-region-expansion-profitability-check=0 -polly-allow-nonaffine-branches=false < %s 2>&1 | \
+; RUN: FileCheck %s -check-prefix=NO-NONEAFFINE
 
 ; NONAFFINE-NOT: Statements
 
diff --git a/polly/test/ScopInfo/cfg_consequences.ll b/polly/test/ScopInfo/cfg_consequences.ll
index 9161d3db4167a..70c6a0a720f00 100644
--- a/polly/test/ScopInfo/cfg_consequences.ll
+++ b/polly/test/ScopInfo/cfg_consequences.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly '-passes=print<polly-function-scops>' -disable-output < %s 2>&1 | FileCheck %s
+; RUN: opt %loadNPMPolly '-passes=print<polly-function-scops>' -disable-output \
+; RUN: -polly-region-expansion-profitability-check=0 < %s 2>&1 | FileCheck %s
 ;
 ; void consequences(int *A, int bool_cond, int lhs, int rhs) {
 ;
diff --git a/polly/test/ScopInfo/complex-successor-structure-3.ll b/polly/test/ScopInfo/complex-successor-structure-3.ll
index 6da1fe3a8b9f3..1c67114b11979 100644
--- a/polly/test/ScopInfo/complex-successor-structure-3.ll
+++ b/polly/test/ScopInfo/complex-successor-structure-3.ll
@@ -1,5 +1,6 @@
 ; RUN: opt %loadNPMPolly -disable-output '-passes=print<polly-function-scops>' \
-; RUN: -polly-invariant-load-hoisting=true < %s 2>&1 | FileCheck %s
+; RUN: -polly-invariant-load-hoisting=true -polly-region-expansion-profitability-check=0 \
+; RUN: < %s 2>&1 | FileCheck %s
 ;
 ; Check that propagation of domains from A(X) to A(X+1) will keep the
 ; domains small and concise.
diff --git a/polly/test/ScopInfo/complex_execution_context.ll b/polly/test/ScopInfo/complex_execution_context.ll
index 9880a1dd67d19..a90d09d213d16 100644
--- a/polly/test/ScopInfo/complex_execution_context.ll
+++ b/polly/test/ScopInfo/complex_execution_context.ll
@@ -1,6 +1,6 @@
 ; RUN: opt %loadNPMPolly -pass-remarks-analysis="polly-scops" '-passes=print<polly-function-scops>' \
-; RUN:     -polly-invariant-load-hoisting=true \
-; RUN:     -disable-output < %s 2>&1 | FileCheck %s
+; RUN: -polly-region-expansion-profitability-check=0 -polly-invariant-load-hoisting=true \
+; RUN: -disable-output < %s 2>&1 | FileCheck %s
 ;
 ; CHECK: Low complexity assumption:
 ;
diff --git a/polly/test/ScopInfo/infeasible_invalid_context.ll b/polly/test/ScopInfo/infeasible_invalid_context.ll
index 006901ab05b79..2f7308311953c 100644
--- a/polly/test/ScopInfo/infeasible_invalid_context.ll
+++ b/polly/test/ScopInfo/infeasible_invalid_context.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly '-passes=print<polly-detect>' -disable-output < %s 2>&1 \
+; RUN: opt %loadNPMPolly '-passes=print<polly-detect>' -disable-output \
+; RUN: -polly-region-expansion-profitability-check=0 < %s 2>&1 \
 ; RUN:  | FileCheck %s -check-prefix=DETECT
 
 ; RUN: opt %loadNPMPolly '-passes=print<polly-detect>,print<polly-function-scops>' -disable-output < %s 2>&1 \
diff --git a/polly/test/ScopInfo/long-sequence-of-error-blocks.ll b/polly/test/ScopInfo/long-sequence-of-error-blocks.ll
index 4ef5ef09c44b7..dff1cf9e6bbdb 100644
--- a/polly/test/ScopInfo/long-sequence-of-error-blocks.ll
+++ b/polly/test/ScopInfo/long-sequence-of-error-blocks.ll
@@ -1,5 +1,6 @@
 ; RUN: opt %loadNPMPolly '-passes=print<polly-function-scops>' -disable-output \
-; RUN: -polly-invariant-load-hoisting=true < %s 2>&1 | FileCheck %s
+; RUN: -polly-region-expansion-profitability-check=0 -polly-invariant-load-hoisting=true \
+; RUN: < %s 2>&1 | FileCheck %s
 
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
diff --git a/polly/test/ScopInfo/non-pure-function-calls-causes-dead-blocks.ll b/polly/test/ScopInfo/non-pure-function-calls-causes-dead-blocks.ll
index 6cbb41041be88..34ad27cf7fbf3 100644
--- a/polly/test/ScopInfo/non-pure-function-calls-causes-dead-blocks.ll
+++ b/polly/test/ScopInfo/non-pure-function-calls-causes-dead-blocks.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly '-passes=print<polly-function-scops>' -disable-output < %s 2>&1 | FileCheck %s
+; RUN: opt %loadNPMPolly '-passes=print<polly-function-scops>' -disable-output \
+; RUN: -polly-region-expansion-profitability-check=0 < %s 2>&1 | FileCheck %s
 ;
 ; Error blocks are skipped during SCoP detection. We skip them during
 ; SCoP formation too as they might contain instructions we can not handle.
diff --git a/polly/test/ScopInfo/phi-in-non-affine-region.ll b/polly/test/ScopInfo/phi-in-non-affine-region.ll
index fbbc158b566bb..f7b720e4dd296 100644
--- a/polly/test/ScopInfo/phi-in-non-affine-region.ll
+++ b/polly/test/ScopInfo/phi-in-non-affine-region.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly '-passes=print<polly-function-scops>' -disable-output < %s 2>&1 | FileCheck %s
+; RUN: opt %loadNPMPolly '-passes=print<polly-function-scops>' -disable-output \
+; RUN: -polly-region-expansion-profitability-check=0 < %s 2>&1 | FileCheck %s
 
 ; Verify that 'tmp' is stored in bb1 and read by bb3, as it is needed as
 ; incoming value for the tmp11 PHI node.
diff --git a/polly/test/ScopInfo/phi_after_error_block.ll b/polly/test/ScopInfo/phi_after_error_block.ll
index a1eadff3e9717..2f2346fdacbc1 100644
--- a/polly/test/ScopInfo/phi_after_error_block.ll
+++ b/polly/test/ScopInfo/phi_after_error_block.ll
@@ -1,4 +1,5 @@
-; RUN: opt %loadNPMPolly -polly-stmt-granularity=bb '-passes=print<polly-function-scops>' -disable-output < %s 2>&1 | FileCheck %s
+; RUN: opt %loadNPMPolly -polly-stmt-granularity=bb '-passes=print<polly-function-scops>' \
+; RUN: -polly-region-expansion-profitability-check=0 -disable-output < %s 2>&1 | FileCheck %s
 
 declare void @bar()
 
diff --git a/polly/test/ScopInfo/remarks.ll b/polly/test/ScopInfo/remarks.ll
index 2c173a31c46e9..f979bf2acee83 100644
--- a/polly/test/ScopInfo/remarks.ll
+++ b/polly/test/ScopInfo/remarks.ll
@@ -2,10 +2,12 @@
 ; RUN: -polly-invariant-load-hoisting=true -disable-output < %s 2>&1 | FileCheck %s
 ;
 ; CHECK: remark: test/ScopInfo/remarks.c:4:7: SCoP begins here.
+; CHECK: remark: test/ScopInfo/remarks.c:4:7: No-overflows restriction:    [N, M] -> {  : M <= -2147483649 - N or M >= 2147483648 - N }
+; CHECK: remark: test/ScopInfo/remarks.c:5:13: SCoP ends here.
+; CHECK: remark: test/ScopInfo/remarks.c:7:3: SCoP begins here.
 ; CHECK: remark: test/ScopInfo/remarks.c:9:15: Inbounds assumption:    [N, M, Debug] -> {  : M <= 100 }
 ; CHECK: remark: test/ScopInfo/remarks.c:13:7: No-error restriction:    [N, M, Debug] -> {  : N > 0 and M >= 0 and (Debug < 0 or Debug > 0) }
 ; CHECK: remark: test/ScopInfo/remarks.c:8:5: Finite loop restriction:    [N, M, Debug] -> {  : N > 0 and M < 0 }
-; CHECK: remark: test/ScopInfo/remarks.c:4:7: No-overflows restriction:    [N, M, Debug] -> {  : M <= -2147483649 - N or M >= 2147483648 - N }
 ; CHECK: remark: test/ScopInfo/remarks.c:9:18: Possibly aliasing pointer, use restrict keyword.
 ; CHECK: remark: test/ScopInfo/remarks.c:9:33: Possibly aliasing pointer, use restrict keyword.
 ; CHECK: remark: test/ScopInfo/remarks.c:9:15: Possibly aliasing pointer, use restrict keyword.
diff --git a/polly/test/ScopInfo/user_provided_assumptions.ll b/polly/test/ScopInfo/user_provided_assumptions.ll
index 49b23b1e784dc..206e69c6f343e 100644
--- a/polly/test/ScopInfo/user_provided_assumptions.ll
+++ b/polly/test/ScopInfo/user_provided_assumptions.ll
@@ -4,6 +4,12 @@
 ; CHECK:      remark: <unknown>:0:0: SCoP begins here.
 ; CHECK-NEXT: remark: <unknown>:0:0: Use user assumption: [M, N] -> {  : N <= 2147483647 - M }
 ; CHECK-NEXT: remark: <unknown>:0:0: Use user assumption: [M, N] -> {  : -2147483648 - M <= N <= 2147483647 - M }
+; CHECK-NEXT: remark: <unknown>:0:0: Use user assumption: [M, N] -> {  : 0 < M <= 100 and -2147483648 - M <= N <= 2147483647 - M }
+; CHECK-NEXT: remark: <unknown>:0:0: Use user assumption: [M, N] -> {  : 0 < M <= 100 and 0 < N <= 2147483647 - M }
+; CHECK-NEXT: remark: <unknown>:0:0: SCoP ends here but was dismissed.
+; CHECK-NEXT: remark: <unknown>:0:0: SCoP begins here.
+; CHECK-NEXT: remark: <unknown>:0:0: Use user assumption: [M, N] -> {  : N <= 2147483647 - M }
+; CHECK-NEXT: remark: <unknown>:0:0: Use user assumption: [M, N] -> {  : -2147483648 - M <= N <= 2147483647 - M }
 ; CHECK-NEXT: remark: <unknown>:0:0: Use user assumption: [M, N, Debug] -> {  : Debug = 0 and 0 < M <= 100 and -2147483648 - M <= N <= 2147483647 - M }
 ; CHECK-NEXT: remark: <unknown>:0:0: Use user assumption: [M, N, Debug] -> {  : Debug = 0 and 0 < M <= 100 and 0 < N <= 2147483647 - M }
 ; CHECK-NEXT: remark: <unknown>:0:0: SCoP ends here.