[PATCH] D153453: [MemCpyOpt] implement single BB stack-move optimization which unify the static unescaped allocas

Nikita Popov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 10 01:28:27 PDT 2023


nikic added inline comments.


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1427
   return true;
 }
+// Attempts to optimize the pattern whereby memory is copied from an alloca to
----------------
nit: Missing newline


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1429
+// Attempts to optimize the pattern whereby memory is copied from an alloca to
+// another alloca, where the two allocas doesn't have conflicting mod/ref. If
+// successful, the two allocas can be merged into one and the transfer can be
----------------
doesn't -> don't


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1437
+// the lifetime markers of the single merged alloca to the nearest dominating
+// and postdominating basic block. Note that the "shrink wrapping" procedure is
+// a safe transformation only because we restrict the scope of this optimization
----------------
basic block -> instruction? At least for now.


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1449
+  // PostDominator and we would be able to relax the condition for each blocks.
+  if (SrcAlloca->getParent() != DestAlloca->getParent()) {
+    LLVM_DEBUG(dbgs() << "Stack Move: src and dest allocas are not in the "
----------------
This is implied by isStaticAlloca(), which requires allocas to be in the entry block. The interesting block limitation is the check during the use loop.


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1476
+
+  // 2-1. Check that src and dest are static allocas, which is irrelevant to
+  // stacksave/stackrestore.
----------------
which is irrelevant to -> which are not affected by


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1482
+  // 2-2. Check that src and dest are never captured, unescaped allocas. Also
+  // collect lifetime markers first/last users for shrink wrap the lifetimes,
+  // and instructions with noalias metadata to remove them.
----------------
for -> in order to


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1501
+      if (GEP->isInBounds())
+        return true;
+    bool CanBeNull, CanBeFreed;
----------------
Drop the inbounds handling code, it's no longer correct. Actually, it would be fine to just pass nullptr instead of IsDereferenceableOrNull, we don't really care about this case.


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1505
+  };
+  SmallVector<Instruction *, 8> Worklist;
+  Worklist.push_back(DestAlloca);
----------------
There should also be a Visited set, otherwise we can go into an infinite loop if phi nodes are involved.

We can then also limit the Visited set size using getDefaultMaxUsesToExploreForCaptureTracking().


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1516
+          return false;
+          continue;
+        case UseCaptureKind::PASSTHROUGH:
----------------
Unnecessary continue after return.


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1532
+          if (UI->isLifetimeStartOrEnd()) {
+            // We note these locations of these intrinsic calls so that we can
+            // delete them later if the optimization succeeds, this is safe
----------------
these locations -> the locations


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1554
+  // the alloca to the Store.
+  bool DestMod = false, DestRef = false;
+  MemoryLocation DestLoc(DestAlloca, LocationSize::precise(Size));
----------------
I would probably keep these in a ModRefInfo variable rather than splitting into bool.


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1589
+    return true;
+  };
+
----------------
When checking dest, you allow ref before the store. Here you allow any mod/ref before the store. That means you could have ref of dest and mod of src. After merging, the ref of dst might read the mod of src now. If I'm following the logic right, this wouldn't be correct.

TBH I don't think there's a need to allow ref of dst before the store. As there can't be any mod, this means the ref will always read uninitialized memory. This doesn't seem like a useful case to handle, as those refs can be optimized away.


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1610
+  ConstantInt *AllocaSize =
+      cast<ConstantInt>(ConstantInt::get(Type::getInt64Ty(C), Size));
+  // Create a new lifetime start marker before the first user of src or alloca
----------------
I don't think this cast is necessary.


================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1619
+  Builder.SetInsertPoint(LastUser->getParent(), ++LastUser->getIterator());
+  Builder.CreateLifetimeEnd(SrcAlloca, AllocaSize);
+
----------------
As discussed, it's better to only add lifetime markers if there originally were any.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153453/new/

https://reviews.llvm.org/D153453



More information about the llvm-commits mailing list