<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/62652>62652</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [AArch64]: 128-bit seq_cst load can be reordered before prior RMW operations under LSE and above.
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          lukeg101
      </td>
    </tr>
</table>

<pre>
    After chatting with @Wilco1, I used Telechat [1] to generate litmus tests that check how concurrent 128-bit C/C++ programs are compiled. Consider the following litmus test (https://godbolt.org/z/hEcsc77oe):
```
// globals
__int128_t* P0_r0;
__int128_t* P1_r0;
_Atomic __int128_t* x;
_Atomic __int128_t* y;

// Test body
void P0 () {
  atomic_fetch_add_explicit(x,1,memory_order_seq_cst);
  __int128_t r0 = atomic_load_explicit(y,memory_order_seq_cst);
  *P0_r0 = r0;
}

void P1 () {
  atomic_store_explicit(y,1,memory_order_seq_cst);
  __int128_t r0 = atomic_load_explicit(x,memory_order_seq_cst);
  *P1_r0 = r0;
}

// exists (0:r0=0 /\ 1:r0=0) <-- forbidden by rc11 memory model
```
Simulating this test under the rc11 memory model, from an initial state where `x`, `y`, `P0_r0` and `P1_r0` are zero-initialised, gives the following outcomes:
```
{ 0:r0=0; 1:r0=1; }
{ 0:r0=1; 1:r0=0; }
{ 0:r0=1; 1:r0=1; }
```
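
To make the reasoning behind these three outcomes concrete, here is a minimal sketch that enumerates every interleaving of the four operations while keeping each thread's program order. It is a toy model and not Telechat: it assumes that, for a test using only `seq_cst` accesses, the rc11 outcomes coincide with plain interleavings, and it uses `int` in place of the 128-bit atomics. The names `sc_outcomes`, `run_schedule` and `struct outcome` are hypothetical.

```c
/* Toy model, not Telechat: plain ints stand in for the 128-bit atomics and
   every seq_cst operation is treated as one indivisible step. */
enum op { P0_FETCH_ADD_X, P0_LOAD_Y, P1_STORE_Y, P1_LOAD_X };

struct outcome { int p0_r0; int p1_r0; };

/* Execute one schedule of the four operations from the all-zero state. */
static struct outcome run_schedule(const enum op sched[4]) {
    int x = 0, y = 0;
    struct outcome o = { 0, 0 };
    for (int i = 0; i < 4; i++) {
        switch (sched[i]) {
        case P0_FETCH_ADD_X: x += 1;      break;
        case P0_LOAD_Y:      o.p0_r0 = y; break;
        case P1_STORE_Y:     y = 1;       break;
        case P1_LOAD_X:      o.p1_r0 = x; break;
        }
    }
    return o;
}

/* Mark seen[a][b] = 1 for every outcome 0:r0=a, 1:r0=b reachable by an
   interleaving that keeps each thread's program order. */
void sc_outcomes(int seen[2][2]) {
    static const enum op p0[2] = { P0_FETCH_ADD_X, P0_LOAD_Y };
    static const enum op p1[2] = { P1_STORE_Y, P1_LOAD_X };

    for (int a = 0; a < 2; a++)
        for (int b = 0; b < 2; b++)
            seen[a][b] = 0;

    /* Bit s of mask set means slot s is taken by P1; P1 needs exactly two. */
    for (unsigned mask = 0; mask < 16; mask++) {
        int bits = 0;
        for (int s = 0; s < 4; s++)
            bits += (mask >> s) & 1u;
        if (bits != 2)
            continue;

        /* Fill the four slots, consuming each thread's ops in order. */
        enum op sched[4];
        int i0 = 0, i1 = 0;
        for (int s = 0; s < 4; s++)
            sched[s] = ((mask >> s) & 1u) ? p1[i1++] : p0[i0++];

        struct outcome o = run_schedule(sched);
        seen[o.p0_r0][o.p1_r0] = 1;
    }
}
```

Calling `sc_outcomes` on a 2x2 array marks the three outcomes listed above and leaves `seen[0][0]` clear: whichever thread performs its load last has already been preceded by the other thread's write.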
When compiled using `clang -O3 -pthread -std=c11 -march=armv8.4-a`, the 128-bit sequentially consistent (SC) load is emitted as `ldp; dmb ish`, and the SC store is implemented as a compare-and-swap loop using `ldaxp`/`stlxp`. `ldp` has no ordering constraint and can be reordered before the `stlxp`. As a result, simulating the AArch64 program under the AArch64 memory model allows an outcome that the source model forbids for the source program:
```
{ 0:r0=0; 1:r0=0; } <- forbidden by rc11, a bug!
{ 0:r0=0; 1:r0=1; }
{ 0:r0=1; 1:r0=0; }
{ 0:r0=1; 1:r0=1; }
```
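
The forbidden outcome can be traced by hand. The sketch below replays one execution order in which P1's load of `x` (the `ldp`) takes effect before P1's store to `y` (the `ldaxp`/`stlxp` loop), while P0 runs in program order. It is an illustration only, not the AArch64 memory model: `int` stands in for the 128-bit atomics, each operation is treated as a single step, and `hoisted_load_witness` and `struct outcome` are hypothetical names.

```c
/* Toy model only: plain ints stand in for the 128-bit atomics and each
   operation is one indivisible step. */
struct outcome { int p0_r0; int p1_r0; };

/* Replay one order in which P1's load of x is performed before P1's
   store to y, starting from the all-zero initial state. */
struct outcome hoisted_load_witness(void) {
    int x = 0, y = 0;
    struct outcome o;

    o.p1_r0 = x;  /* P1: load x, performed early, reads 0 */
    x += 1;       /* P0: fetch_add(x, 1) */
    o.p0_r0 = y;  /* P0: load y, reads 0 because P1's store has not landed */
    y = 1;        /* P1: store y = 1, performed late */

    return o;
}
```

Both registers end up zero, which matches the first line of the outcome list above. Hoisting the load in one thread alone is enough to reach it in this model.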
As far as I can tell, this affects any code that mixes uses of `ldp` under LSE{2} (and above) with the compare-and-swap loop. I have observed this on clang versions 13, 14 and 15 (albeit with a different implementation using `sync`), but not on clang 16, as the `caspal` instruction has acquire-release semantics and does not allow the forbidden outcome.

[1] https://www.youtube.com/watch?v=xn4jtXOGfKg

</pre>