[PATCH] D15592: [LICM] Make store promotion work in the face of unordered atomics

Wed Feb 3 17:19:33 PST 2016

reames added a comment.

In http://reviews.llvm.org/D15592#343586, @hfinkel wrote:

> In http://reviews.llvm.org/D15592#343574, @reames wrote:
>
> > In http://reviews.llvm.org/D15592#342294, @hfinkel wrote:
> >
> > > > Promoting the non-atomic accesses to atomic would be incorrect.
> > >
> > >
> > > Why?
> >
> >
> > Per the example I gave, we might not be able to lower a 128 bit atomic, and the code might be conditional on that fact.  Inserting an unconditional i128 atomic which can't be lowered which wasn't in the original code in that case seems "questionable".  I didn't have a need to do this and the semantics seemed a bit unclear, so it seemed better to avoid.
>
>
> Fair enough, and I'm fine with not doing this. I don't understand your example, however, because if we can't lower the atomic, then the compilation will fail regardless of whether or not the atomic is conditionally executed. We still need to generate code for it if it is present.

My reasoning was that the very next thing the optimizer might do is constant fold the "can lower i128 atomically" branch to false.  The original program would then not contain a atomic i128 (i.e. codegen failure) while a program we promoted to atomic i128 would.  I'm not sure this is strictly speaking legal since the program is relying on optimizer behaviour for correctness.

> > On the topic of infinite loops, I could go either way.  In practice, any infinite loop will have a safepoint or deoptimization point within it, so the loop promotion couldn't trigger anyways.  Given that, I don't see any reason why we'd need to treat potentially infinite loops specially here.  (Again, C++'s rules are not relevant.)

> 

> 

> I understand, and the fact that this can't come up in your environment is interesting. However, from LLVM's perspective, we still need to decide on the semantics here.

I think the answer should be that "unordered" atomics are treated like normal non-atomic instructions in all ways *except* that the actual memory operation is atomic.  Thus, there are no special requirement about infinite loops.

I draw a mental distinction between atomicity, ordering, and visibility.  All of our "atomics" are atomic, but not all of them are ordered.  The visibility rules are essentially unspecified, but in practice, we have to give the "ordered" forms eventual visibility guarantees since that's what the C++ spec requires.

To say this differently, I really don't think we want the optimizer to be in a position where it has to prove a loop is not dynamically infinite before doing store promotion.  That seems like a major complication.  If we later find we need a unordered, atomic, but eventually visible store, I'd rather add that explicitly to the IR.  Having both forms seems potentially useful.

For a code clarity perspective, should we add a predicate like mustBeEventuallyVisible(AtomicOrdering)?  This would give a place to describe the result of this discussion - whatever we eventually settle on - in a way future uses might find.  :)

p.s. Lest it be confusing, our needs on this changed in the not too distant past.  I was at one point arguing that LLVM should allow infinite loops without side effects.  I'm no longer of that belief.  So long as every infinite loop has a side effect (for us, a potential safepoint, deoptimization point, or call), then we do not need any special handling for the unordered atomic stores.

http://reviews.llvm.org/D15592