[PATCH] D56534: [Verifier] Add verification of unaligned atomic load/store

Fri Jan 18 11:19:32 PST 2019

reames added a comment.

In D56534#1362677 <https://reviews.llvm.org/D56534#1362677>, @jyknight wrote:

> >> I'm not sure I agree: those are unordered memsets, which are basically @reames' memsets. I think it's up to him to give them semantics that are useful for his target.
> > 
> > Thank you. I was about to say pretty much the same. In particular, trying to claim that a transform is incorrect and needs reverted based on a proposed reading of a change to the LangRef which has not been posted for review, much less accepted seems to be going a definite step too far.
>
> I must apologize for having given the impression I was actually calling for a revert now. I did not intend that -- we most definitely should not be hasty here! I meant "need to be reverted" in a more general sense.

Gotcha, moving on...

> That said, it's not the case that unordered memset/memcpy should have semantics defined to as "whatever's useful to reames". :) They need well-defined semantics at the IR level, compatible with other atomics' semantics.

I agree.

> And they have them! -- they are intended to have the same behavior as if you had written a series of <len> / <elementsize> unordered atomic load/store instructions with size <elementsize>.
> 
> So, then, the question is: what is safe to do with atomic unordered instructions? Shall we merge these two instructions into one 32-bit store instruction?
> 
>   store atomic i16 4, i16* getelementptr ([2 x i16], [2 x i16]* @Z, i32 0, i32 0) unordered, align 4
>   store atomic i16 3, i16* getelementptr ([2 x i16], [2 x i16]* @Z, i32 0, i32 1) unordered, align 2
>    
> 
> We currently do not. On X86, for unordered atomics of those sizes and alignments, I'm pretty sure the above would in fact be a safe and beneficial optimization.

I would argument that we can, and possibly should.  This is definitely legal on x86.  Profitability is likely, but there might be some cornercase so I'll save that discussion for the patch review if someone wants to implement it!

> Similarly, merging two 32-bit stores into a (properly 8-byte-aligned) 64-bit store should also be safe -- but only on the Pentium and later. Otherwise, that's be an incorrect transformation, because the 8-byte atomic op ends up needing to do a libcall, which may not be atomic w.r.t. the 4-byte atomic read/write machine instruction.

I agree with your facts.  This is one of the arguments for me that some of this will have to be target defined and controlled by target hooks.

> My proposal is that at the IR level, we should document that it's unsafe to mix and match sizes on atomic operations, and behave as such in all our IR optimization passes. But, that doesn't prohibit teaching SelectionDAG load/store merging to do the above optimization in target configurations where it's known to be safe to do so.

I think what we're debating here is the meaning of "unsafe".  If "unsafe" is interpreted as "has target defined behaviour", then I completely agree.  If "unsafe" is defined as "result is undefined" (in the UB sense), then I strongly disagree.

Do you have some hesitancy about this being target controlled for some reason?  We certainly have precedent for a number of target legality hooks in optimizations already.  Is there a concern I'm missing here?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56534/new/

https://reviews.llvm.org/D56534