[PATCH] D37463: Fix miscompile in LoopSink pass

Wed Sep 27 12:21:37 PDT 2017

hfinkel added a comment.

In https://reviews.llvm.org/D37463#882487, @reames wrote:

> In https://reviews.llvm.org/D37463#867451, @reames wrote:
>
> > I think there might be some further investigation needed for unordered atomics, but I will take that offline with Daniil before returning to this thread.
>
>
> Returning to this point, the conclusion I would reach reviewing our Atomics documentation is that sinking an unordered atomic into a loop must be disallowed, because doing so duplicates the load which is disallowed.
>
> The relevant section of the documentation is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for Optimizers section.  Here's the full text of the section under discussion:
>
> Notes for optimizers
>
>   In terms of the optimizer, this *prohibits any transformation that transforms a single load into multiple loads*, transforms a store into multiple stores, narrows a store, or stores a value which would not be stored otherwise. Some examples of unsafe optimizations are narrowing an assignment into a bitfield, *rematerializing a load*, and turning loads and stores into a memcpy call. Reordering unordered operations is safe, though, and optimizers should take advantage of that because unordered operations are common in languages that need them.
>   
>
> The important bits are in *bold*.  The first highlighted clause could be interpreted to be a weaker constraint that it first seems.  In particular, it could be read as only disallowing *splitting* a load into multiple pieces, and not disallowing *duplicating* an load.  Internally, several folks read it that way.  I believe that is an incorrect interpretation and that duplication must also be disallowed by the text.  My reasoning is as follows:
>
> Sinking a load into the header of a loop with a trip count > 1 increases the number of times the load is run dynamically.
>
> Rematerialization (which is explicitly allowed as a disallowed transformation) is the insertion of loads by the register allocator at use sites.  LoopSinking and rematerialization end up producing *exactly the same code* on the following example:
>
> L = load atomic %p
>  while (foo()) {
>
>   call_which_clobbers_all_regs();
>   use L
>
> }
>
> Both transforms would want to produce:
>
> while (foo()) {
>
>   call_which_clobbers_all_regs();
>   L = load atomic %p
>   use L
>
> }
>
> If this were legal for LoopSink, then it would also be legal for rematerialization.  And it's definitely not, as explicitly stated in the text.  As such, it must be the case that LoopSink is disallowed for this case as well, which requires the broader interpretation of the initial sentence.
>
> Given this, I must conclude that LoopSink is currently buggy as we do sink unordered atomic loads into loops.

I agree. Sinking the load into the loop is equivalent to rematerialization.

Repository:
  rL LLVM

https://reviews.llvm.org/D37463