[polly] Add profitability check for expanded region. (PR #96548)

Thu Aug 1 16:38:48 PDT 2024

================
@@ -1749,6 +1760,84 @@ bool ScopDetection::isProfitableRegion(DetectionContext &Context) const {
   return invalid<ReportUnprofitable>(Context, /*Assert=*/true, &CurRegion);
 }
 
+bool ScopDetection::isRegionExpansionProfitable(const Region &ExpandedRegion,
+                                                LoopInfo &LI) const {
+  if (!RegionExpansionProfitabilityCheck)
+    return true;
+
+  POLLY_DEBUG(dbgs() << "\nChecking expanded region: "
+                     << ExpandedRegion.getNameStr() << "\n");
+
+  // Collect outermost loops from expanded region.
+  SmallPtrSet<const Loop *, 2> OutermostLoops;
+  for (BasicBlock *BB : ExpandedRegion.blocks()) {
+    Loop *L = ExpandedRegion.outermostLoopInRegion(&LI, BB);
+    if (L)
+      OutermostLoops.insert(L);
+  }
+
+  if (OutermostLoops.empty()) {
+    POLLY_DEBUG(dbgs() << "Unprofitable expanded region: no loops found.\n");
+    return false;
+  }
+
+  // Return region expansion as unprofitable, if it contains basic blocks with
+  // memory accesses not used in outermost loops of the expanded region.
+  for (BasicBlock *BB : ExpandedRegion.blocks()) {
+    if (&BB->front() == BB->getTerminator())
+      continue;
+    if (BB == ExpandedRegion.getEntry())
+      continue;
+
+    // Only consider the expansion blocks added in addition to the loops. Also
+    // ignore preheader blocks because they may contain loop invariant loads.
+    if (llvm::any_of(OutermostLoops, [&](const Loop *L) {
+          return L->contains(BB) || (BB == L->getLoopPreheader());
+        }))
+      continue;
+
+    // Check if a basic block has instruction that access memory, but not used
+    // in any outermost loops of the expanded region.
+    bool BBContainsUnrelatedMemAccesses =
+        llvm::any_of(*BB, [&](const Instruction &I) {
+          if (!I.mayReadOrWriteMemory())
+            return false;
+          if (I.user_empty())
+            return false;
----------------
huihzhang wrote:

I agree with skipping any leading or trailing non-loop blocks.

Previously, I checked for memory accesses because non-loop blocks with no memory accesses don't cause much trouble when the loop is analyzed, although there is no benefit to adding these non-loop blocks either.

If non-loop blocks do contain memory accesses, then yes, the addresses of those accesses would matter more than the values being loaded.
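
To make the two options concrete, here is a rough standalone sketch of how they differ. It does not use the LLVM or Polly APIs: `BlockInfo` and both predicates are hypothetical stand-ins, and it leaves out the patch's check on whether an access is used inside the loops. The first predicate models the memory-access-based check, the second models one possible reading of skipping every leading or trailing non-loop block.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a basic block of an expanded region.
struct BlockInfo {
  std::string Name;
  bool InOutermostLoop = false; // block belongs to an outermost loop of the region
  bool IsLoopPreheader = false; // block is the preheader of such a loop
  bool IsRegionEntry = false;   // block is the entry of the expanded region
  bool HasMemoryAccess = false; // block may read or write memory
};

// True for blocks that the expansion added on top of the loops themselves.
static bool isExtraBlock(const BlockInfo &B) {
  return !B.InOutermostLoop && !B.IsLoopPreheader && !B.IsRegionEntry;
}

// True if at least one block belongs to an outermost loop.
static bool hasLoop(const std::vector<BlockInfo> &Blocks) {
  return std::any_of(Blocks.begin(), Blocks.end(),
                     [](const BlockInfo &B) { return B.InOutermostLoop; });
}

// Variant 1: extra non-loop blocks are tolerated as long as they do not
// touch memory.
bool profitableIfExtraBlocksAccessFree(const std::vector<BlockInfo> &Blocks) {
  if (!hasLoop(Blocks))
    return false; // no loops found, nothing to optimize
  return std::none_of(Blocks.begin(), Blocks.end(), [](const BlockInfo &B) {
    return isExtraBlock(B) && B.HasMemoryAccess;
  });
}

// Variant 2: any extra non-loop block makes the expansion unprofitable,
// whether or not it touches memory.
bool profitableIfNoExtraBlocks(const std::vector<BlockInfo> &Blocks) {
  if (!hasLoop(Blocks))
    return false; // no loops found, nothing to optimize
  return std::none_of(Blocks.begin(), Blocks.end(), isExtraBlock);
}
```

In this model, a region made of a loop body plus an access-free trailing block passes the first predicate and fails the second. Once the trailing block contains a memory access, both predicates reject it.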

https://github.com/llvm/llvm-project/pull/96548