[PATCH] D115615: [MemCpyOpt] Make capture check during call slot optimization more precise

Tue Dec 21 09:27:08 PST 2021

reames added a comment.

I don't see the original bug you're fixing in the test cases, and your explanation isn't clear to me.  Can you expand on that point a bit?

On your extension, I think there might be a useful generalization here.  As near as I can tell, your bit of code is effectively a local no-capture analysis.  Your reasoning could be phrased as "if the captured memory can't be accessed in a well-defined manner before the end of the lifetime of the captured storage, it can't actually have been captured".  Right?
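
To make sure I'm reading the rule correctly, here is a minimal sketch of it over a toy instruction list. The types and the name captureIsUnobservable are hypothetical stand-ins, not LLVM APIs, and the sketch assumes a single basic block and simply gives up if it never sees the lifetime end.

```cpp
#include <vector>

// Toy stand-ins for IR instructions; these types are hypothetical, not LLVM's.
enum class Kind { LifetimeEnd, MayAccessMemory, Other };

struct ToyInst {
  Kind kind;
  int object; // stack object a LifetimeEnd refers to; ignored for other kinds
};

// Returns true when nothing after the call can touch memory before the
// storage `obj` dies, so a capture by the call could never be observed.
bool captureIsUnobservable(const std::vector<ToyInst> &afterCall, int obj) {
  for (const ToyInst &inst : afterCall) {
    if (inst.kind == Kind::LifetimeEnd && inst.object == obj)
      return true;  // storage is dead; later accesses are not well defined
    if (inst.kind == Kind::MayAccessMemory)
      return false; // a captured pointer might be used here
  }
  return false;     // never saw the lifetime end: stay conservative
}
```

The walk stops at the first instruction that may read or write memory, which is the conservative choice; a real version would presumably ask alias analysis whether that instruction can touch the alloca at all.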

Assuming that's correct, what about doing this in DSE and annotating the call parameter nocapture directly?  Shouldn't the backwards walk DSE does on dead allocas be enough to annotate these cases?  If so, MemCpyOpt could then simply fix the bug, and we would get generally stronger nocapture reasoning everywhere.

Your modref check is the only bit I'm not sure ports over naturally.  It depends on how important that case is to you.

================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:966
+        if (II->getIntrinsicID() == Intrinsic::lifetime_end &&
+            II->getArgOperand(1)->stripPointerCasts() == srcAlloca)
+          break;
----------------
I believe you need to check the size of the lifetime.end here as well.  You could have a zero-sized end, which I believe is a no-op.
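
Roughly what I have in mind, as a toy model rather than real LLVM code. ToyLifetimeEnd and endsLifetimeOf are hypothetical names, and treating -1 as "unknown size, covers the object" is my assumption here.

```cpp
#include <cstdint>

// Hypothetical model of a lifetime.end marker: (size, object).
struct ToyLifetimeEnd {
  std::int64_t size; // 0 covers nothing; -1 is assumed to mean "unknown size"
  int object;        // which stack object the marker points at
};

// A marker only ends the lifetime of `obj` if it names that object and
// covers a nonzero number of bytes.
bool endsLifetimeOf(const ToyLifetimeEnd &end, int obj) {
  return end.object == obj && end.size != 0;
}
```

In the patch this would amount to one extra condition on the size operand next to the existing pointer comparison.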

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115615/new/

https://reviews.llvm.org/D115615