[PATCH] D115615: [MemCpyOpt] Make capture check during call slot optimization more precise

Nikita Popov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 13 02:18:11 PST 2021


nikic created this revision.
nikic added reviewers: asbirlea, fhahn, jdoerfert.
Herald added subscribers: JDevlieghere, hiraditya.
nikic requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Call slot optimization is currently supposed to be prevented if the call can capture the source pointer. Due to an implementation bug, this check currently doesn't trigger if a bitcast of the source pointer is passed instead. I'm somewhat afraid of the fallout of fixing this bug (due to heavy reliance on call slot optimization in rust), so I'd like to strengthen the capture reasoning a bit first.

In particular, I believe that the capture is fine as long as there is no possible use of the captured pointer before the lifetime of the source alloca ends, either due to lifetime.end or a return from a function. At that point the potentially captured pointer becomes dangling. (I tried checking this in alive2, but it also accepts the transform in cases it clearly shouldn't, so I assume it just doesn't model captures yet.)


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D115615

Files:
  llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
  llvm/test/Transforms/MemCpyOpt/capturing-func.ll


Index: llvm/test/Transforms/MemCpyOpt/capturing-func.ll
===================================================================
--- llvm/test/Transforms/MemCpyOpt/capturing-func.ll
+++ llvm/test/Transforms/MemCpyOpt/capturing-func.ll
@@ -57,8 +57,7 @@
 ; CHECK-NEXT:    [[PTR1:%.*]] = alloca i8, align 1
 ; CHECK-NEXT:    [[PTR2:%.*]] = alloca i8, align 1
 ; CHECK-NEXT:    call void @llvm.lifetime.start.p0i8(i64 1, i8* [[PTR2]])
-; CHECK-NEXT:    call void @foo(i8* [[PTR2]])
-; CHECK-NEXT:    call void @llvm.memcpy.p0i8.p0i8.i32(i8* [[PTR1]], i8* [[PTR2]], i32 1, i1 false)
+; CHECK-NEXT:    call void @foo(i8* [[PTR1]])
 ; CHECK-NEXT:    call void @llvm.lifetime.end.p0i8(i64 1, i8* [[PTR2]])
 ; CHECK-NEXT:    call void @foo(i8* [[PTR1]])
 ; CHECK-NEXT:    ret void
@@ -79,8 +78,7 @@
 ; CHECK-LABEL: @test_function_end(
 ; CHECK-NEXT:    [[PTR1:%.*]] = alloca i8, align 1
 ; CHECK-NEXT:    [[PTR2:%.*]] = alloca i8, align 1
-; CHECK-NEXT:    call void @foo(i8* [[PTR2]])
-; CHECK-NEXT:    call void @llvm.memcpy.p0i8.p0i8.i32(i8* [[PTR1]], i8* [[PTR2]], i32 1, i1 false)
+; CHECK-NEXT:    call void @foo(i8* [[PTR1]])
 ; CHECK-NEXT:    ret void
 ;
   %ptr1 = alloca i8
Index: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
===================================================================
--- llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
+++ llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
@@ -945,11 +945,44 @@
       return false;
   }
 
-  // Check that src isn't captured by the called function since the
-  // transformation can cause aliasing issues in that case.
+  // Check whether src is captured by the called function, in which case there
+  // may be further indirect uses of src.
+  bool SrcIsCaptured = false;
   for (unsigned ArgI = 0, E = C->arg_size(); ArgI != E; ++ArgI)
     if (C->getArgOperand(ArgI) == cpySrc && !C->doesNotCapture(ArgI))
-      return false;
+      SrcIsCaptured = true;
+
+  // If src is captured, then check whether there are any potential uses of
+  // src through the captured pointer before the lifetime of src ends, either
+  // due to a lifetime.end or a return from the function.
+  if (SrcIsCaptured) {
+    MemoryLocation SrcLoc =
+        MemoryLocation(srcAlloca, LocationSize::precise(srcSize));
+    for (Instruction &I : make_range(++C->getIterator(),
+                                     C->getParent()->end())) {
+      // Lifetime of srcAlloca ends at lifetime.end.
+      if (auto *II = dyn_cast<IntrinsicInst>(&I)) {
+        if (II->getIntrinsicID() == Intrinsic::lifetime_end &&
+            II->getArgOperand(1)->stripPointerCasts() == srcAlloca)
+          break;
+      }
+
+      // Lifetime of srcAlloca ends at return.
+      if (isa<ReturnInst>(&I))
+        break;
+
+      // Ignore the direct read of src in the load.
+      if (&I == cpyLoad)
+        continue;
+
+      // Check whether this instruction may mod/ref src through the captured
+      // pointer (we have already any direct mod/refs in the loop above).
+      // Also bail if we hit a terminator, as we don't want to scan into other
+      // blocks.
+      if (isModOrRefSet(AA->getModRefInfo(&I, SrcLoc)) || I.isTerminator())
+        return false;
+    }
+  }
 
   // Since we're changing the parameter to the callsite, we need to make sure
   // that what would be the new parameter dominates the callsite.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D115615.393817.patch
Type: text/x-patch
Size: 3353 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211213/09382e6d/attachment.bin>


More information about the llvm-commits mailing list