<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/101551>101551</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Unsound idempotent rmw to load optimization in x86 and most likely other backends using lowerIdempotentRMWIntoFencedLoad
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          gonzalobg
      </td>
    </tr>
</table>

<pre>
The `lowerIdempotentRMWIntoFencedLoad` hook seems to allow a backend to implement the LLVM IR optimization that was shown to be unsound in this comment: https://github.com/llvm/llvm-project/issues/56450#issuecomment-1183759339

The x86 backend (among others) performs it in `X86ISelLowering`, but the issue linked above showed that outcomes forbidden by x86 and Rust were actually produced at runtime with this optimization on x86 (and probably on most other backends).

I think we should remove `lowerIdempotentRMWIntoFencedLoad` from LLVM. It is an optimization that feels like it should work, but does not. 

I ran into it while fixing something else. Here is a repro https://clang.godbolt.org/z/s54cGqozK that shows this code: 

```c++
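#include <atomic>
using namespace std;

// Declarations assumed here; the original snippet elides them.
atomic<int> x{0}, y{0}, z{0};
constexpr auto rlx = memory_order_relaxed;
constexpr auto rel = memory_order_release;
constexpr auto acq = memory_order_acquire;

int thread0() {
  y.store(1, rlx);
  atomic_thread_fence(rel);
  x.fetch_add(0, rlx);  // idempotent RMW
  atomic_thread_fence(acq);
  int r0 = z.load(rlx);
  return r0;
}

int thread1() {
  z.store(1, rlx);
  atomic_thread_fence(rel);
  x.fetch_add(0, rlx);  // idempotent RMW
  atomic_thread_fence(acq);
  int r1 = y.load(rlx);
  return r1;
}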
```

This code is guaranteed not to exhibit the `r0 == 0 && r1 == 0` outcome: one of the RMWs reads from the other, and each RMW forms a release/acquire pattern with the fences around it. Nevertheless, it is compiled into store buffering:

```asm
thread0():
        mov     dword ptr [rip + y], 1
        mov     eax, dword ptr [rip + z]
        ret

thread1():
        mov     dword ptr [rip + z], 1
        mov     eax, dword ptr [rip + y]
        ret

because:
- the idempotent RMWs are optimized into loads (incorrect)
- the loads are then dead, so they are removed (the generated code doesn't even access `x`...)

When piping that code into herd with x86 TSO: 

```
X86 SB
{
}
 P0          | P1          ;
 MOV [y],$1  | MOV [z],$1  ;
 MOV EAX,[z] | MOV EAX,[y] ;
exists
(0:EAX=0 /\ 1:EAX=0)
```

it confirms that the outcome `r0 == 0 && r1 == 0` can be produced; that is, the generated binary can produce an outcome that the input C++ code forbids.

</pre>