[PATCH] D28147: [LICM] Allow promotion of some stores that are not guaranteed to execute
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 3 14:11:54 PST 2017
efriedma accepted this revision.
efriedma added a reviewer: efriedma.
efriedma added a comment.
I'm satisfied with this, with some minor tweaks.
================
Comment at: lib/Transforms/Scalar/LICM.cpp:1019
+ return DT->dominates(Store->getParent(), Exit);
+ });
+
----------------
This check is similar to llvm::isGuaranteedToExecute. Maybe worth noting the parallel here.
If the loop throws, you're depending on the earlier check that the memory location is an alloca; you should explicitly note that here.
================
Comment at: test/Transforms/LICM/scalar_promote.ll:222
+ ret i32 %ret
+}
+
----------------
mkuper wrote:
> efriedma wrote:
> > efriedma wrote:
> > > efriedma wrote:
> > > > I'm pretty sure this transform is wrong. For example, `@capture` could create a mutex protecting `%local`, and lock it. Then `@opaque` could release the mutex, does some work, and lock the mutex. In that case, the value stored by `store i32 %x2, i32* %local` is visible to other threads in each iteration, so you can't sink the store out of the loop.
> > > Err, my example is overly complicated. You don't need multiple threads. The address of `%local` escapes, therefore `@opaque` could read it, therefore we have to update it every iteration of the loop.
> > Err, sorry, ignore me, didn't notice the argmemonly annotation on `@opaque`. I need to think a bit.
> I'm not actually changing this behavior.
> The transformation is already performed when @opaque is marked as nounwind. This patch only removes the nounwind requirement, which has no bearing on your scenario.
>
> In any case, I don't think the stores are required to be observable in this scenario (and you also need to pass the mutex as an argument, but, technically, that's possible), so I believe the transformation is still legal. If we reach the conclusion it's not, then it will still be legal if opaque is "argmemonly readonly", since readonly functions are not allowed to perform atomic writes. So we'll need to start requiring readonly (and not just argmemonly), but that's orthogonal to this patch.
>
Mmm... okay, I think I understand what's happening.
>From a memory model perspective, the rule is this: if the current thread performs a non-atomic store to `%local`, no other thread is allowed to read that memory until the current thread performs an atomic release (or equivalent). Therefore, if a loop doesn't contain an atomic release, and every exit is dominated by a non-atomic store, you can sink the store out of the loop. (An atomic release gets modeled by alias analysis as a read to all memory locations whose addresses have escaped, so the existence of the alias set implies the loop doesn't contain a relevant atomic release.)
The wrinkle introduced by a call which throws is that there are additional exits which are not modeled in the CFG; the store might not actually dominate all exits. That said, if a loop throws, we only allow sinking stores to allocas, and allocas get deallocated when the loop throws, so we're still okay.
https://reviews.llvm.org/D28147
More information about the llvm-commits
mailing list