[PATCH] D41400: [MemCpyOpt] Perform call slot optimizations through GEPs

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 21 11:36:51 PST 2017


efriedma added inline comments.


================
Comment at: lib/Transforms/Scalar/MemCpyOptimizer.cpp:836
+  // The store to dest may never happen if the call can throw.
+  if (C->mayThrow() && !isa<AllocaInst>(cpyDest))
+    return false;
----------------
dotdash wrote:
> efriedma wrote:
> > You probably want to clarify why an alloca is special here.
> > 
> > Does it matter if cpyDest could be accessed from another thread?
> > 
> > Do you need to check whether any calls between "C" and "cpy" could throw?
> > You probably want to clarify why an alloca is special here.
> 
> I basically just kept the logic as it was here, but looks like that comment is a bit outdated, and I already made call slot opt a bit too aggressive back when I introduced handling of arguments other than sret ones.
> 
> > Do you need to check whether any calls between "C" and "cpy" could throw?
> 
> I guess so, already to properly handle arguments that are merely dereferenceable and not sret, to avoid stores from showing up that should not be visible after a throw. Would a check for postdominance be sufficient here?
> 
> > Does it matter if cpyDest could be accessed from another thread?
> 
> I'm not completely sure, but I'd say that for multithreaded code to behave correctly here, one would need a memory barrier or something similar here anyway, which should break the dependency between the call and the memcpy, so we should not end up here in the first place. Am I missing something here?
> Would a check for postdominance be sufficient here?

LLVM's PostDominatorTree isn't usable for this, if that's what you're asking.

> for multithreaded code to behave correctly here, one would need a memory barrier or something similar here anyway, which should break the dependency between the call and the memcpy

Oh, okay, that's right.  (There's a similar transform in LICM I was comparing this to, but LICM is more complicated because we promote conditional stores in some cases.)


================
Comment at: lib/Transforms/Scalar/MemCpyOptimizer.cpp:874
   // memcpy can be dropped), that it is not read or written between the call and
   // the memcpy, and that writing beyond the end of it is undefined.
   SmallVector<User*, 8> srcUseList(srcAlloca->user_begin(),
----------------
Sort of orthogonal to the patch, but the check here ("Check that src is not accessed except via the call and the memcpy") doesn't provide the intended guarantee.  If the call is in a loop, src doesn't hold undefined values after the first iteration of the loop.


================
Comment at: lib/Transforms/Scalar/MemCpyOptimizer.cpp:915
+      if (auto G = dyn_cast<GetElementPtrInst>(cpyDest)) {
+        auto P = G->getPointerOperand();
+        auto PI = dyn_cast<Instruction>(P);
----------------
dotdash wrote:
> efriedma wrote:
> > This check is kind of fragile, in that it assumes the other operands to the GEP are all ConstantInts.
> That's a prerequisite to get here, but I see what you mean. I guess I add an explicit `hasAllConstantIndices()` check here?
Sure.  (You could make it an assert if you're confident this code isn't reachable otherwise.)


https://reviews.llvm.org/D41400





More information about the llvm-commits mailing list