[PATCH] D15352: [EarlyCSE] DSE of atomic unordered stores

Wed Dec 9 10:01:03 PST 2015

reames added a comment.

Responding to JF.

================
Comment at: test/Transforms/EarlyCSE/atomics.ll:131
@@ +130,3 @@
+  store atomic i32 3, i32* %P1 unordered, align 4
+  store i32 0, i32* %P1, align 4
+  ret void
----------------
jfb wrote:
> I think the store you keep still needs to be atomic, i.e. you're "sinking" the atomicity to all post-dominating stores. Let's ignore that this is `unordered` (silly example) and imagine it's a `store release`: in that case the writer could have been writing a non-zero value, then racing with itself and writing another non-zero value to the same location. The reader observes the transition to non-zero (yes it's racy *which* non-zero is seen, but one is seen). If you drop `release` then the algorithm becomes incorrect.
> 
> ```
> void writer(Payload p, atomic<int> *flag) {
>   p->update();
>   flag->write(1, release);
>   // ...
>   *reinterpret_cast<int*>(flag) = 2; // WTF?
> }
> void reader(Payload p, atomic<int> *flag) {
>   while (!flag->load(acquire)) ;
>   p->process();
> }
> ```
> 
> In the above example `p->process()` isn't guaranteed to see the latest state from `p`. Yes the code is already UB, but we don't need to be silly.
> 
> Seen another way, replace `reinterpret_cast` with a `flag->store(2, relaxed)` and then we're well-defined, you can eliminate the `store release`, but you have to "sink" the `release` to the later store.
Two problems with this:
1) Your use of a release store is substantially different than what's being modified and tested here.  In particular, a release store implies ordering where an unordered atomic does not.  As a result, the basis for your example collapses.  (If we were implementing ordered atomics here, it would be a problem.)  -- Note I actually agree with your point that we'd have to keep the release store in the modified example.  I believe I even said that in a comment in the code.
2) The execution of the non-atomic store to a racy location is undefined.  You are *not* guaranteed that it publishes a only non-zero value.  Consider the constant value 0x01 being stored over the value 0x10.  Pretend for a moment we see bit level tearing.  It's perfectly legal for the memory to transition from 0x10 to 0x00 to 0x01.  

================
Comment at: test/Transforms/EarlyCSE/atomics.ll:159
@@ +158,3 @@
+; CHECK-LABEL: @test26
+; CHECK-NEXT: store atomic
+; CHECK-NEXT: ret
----------------
jfb wrote:
> Can you check that `3` is the value stored here? We should probably try to avoid unexpected things even when we're allowed to do the unexpected (and document that we're doing this).
Ah, good catch.  And actually, it's required to be 3.  Storing 1 would not be valid.  There could be some later well synchronized operation with a dependent read on another thread.  It's only allowed to observe '3'.  

================
Comment at: test/Transforms/EarlyCSE/atomics.ll:170
@@ +169,3 @@
+; CHECK-NEXT: store atomic
+; CHECK-DAG: release
+; CHECK-NEXT: ret
----------------
jfb wrote:
> Why not just:
> ```
> ; CHECK-LABEL: @test27
> ; CHECK-NEXT: store atomic i32 3, i32* %p1 release
> ; CHECK-NEXT: ret
> ```
> ?
Would also work.  I was trying to avoid testing irrelevant details so that if someone tweaked the test, it wouldn't spuriously fail.  e.g. I could increase the alignment or change the values.  

http://reviews.llvm.org/D15352