<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/65541>65541</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [Armv7-a]: Sequentially Consistent Load Allows Reordering of Prior Store when Implementations are Mixed
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          lukeg101
      </td>
    </tr>
</table>

<pre>
    Consider the following litmus test that has buggy behaviour:
```
C test

{ x = 0; y = 0 }

P0 (atomic_int *x, atomic_int *y) {
 atomic_fetch_add_explicit(x,1,memory_order_seq_cst);
  int32_t r0 = atomic_load_explicit(y,memory_order_seq_cst);
}

void P1 (atomic_int *x, atomic_int *y) {
 atomic_store_explicit(y,1,memory_order_seq_cst);
  int32_t r0 = atomic_load_explicit(x,memory_order_seq_cst);
}

exists 0:r0 = 0 /\ 1:r0 = 0
```
where 'P0:r0 = 0' means thread P0, local variable r0 has value 0

When simulating this test under the C/C++ model from its initial state, the outcome of execution in the exists clause is forbidden by the source model. The allowed outcomes are:
```
{ P0:r0=0; P1:r0=1; }
{ P0:r0=1; P1:r0=0; }
{ P0:r0=1; P1:r0=1; }
```
When compiling P1, to target armv8-a cortex-a15 (to `dmb;str;dmb;ldr`) using clang trunk (https://godbolt.org/z/v6cso7zq5), compiling the load of y on P0 to target armv7-a cortex-a15 (`ldr;dmb`) using clang trunk, and compiling the fetch_add of x on P0 for armv8-a cortex-m55 (also `ldrex;strex;cbnz` loop). The compiled program has the following outcomes when simulated under the aarch32 model:
```
{ P0:R3=0; P1:R0=0; }     <--- forbidden by source, bug!
{ P0:R3=0; P1:R0=1; }
{ P0:R3=1; P1:R0=0; }
{ P0:R3=1; P1:R0=1; }
```
which occurs since the sequentially consistent atomic load of y on P0 is compiled to `ldr;dmb` which does not have a leading dmb fence, and can thus be reordered before the STLEX on P0. Allowing the buggy outcome forbidden by the source.
```
ARM test

{ int x = 0; int y = 0;
  uint32_t %P0_x=x; uint32_t %P0_y = y;
  uint32_t %P1_x=x; uint32_t %P1_y = y;
  1:R2 = 1; }

P0                       | P1;
label: ldaex r1, [%P0_x] | dmb ish;
  add r1, r1, #1         | str r2, [%P1_y] ;
  stlex r2, r1, [%P0_x]  | dmb ish;
  cbnz r2, label         | ldr r0, [%P1_x];
  ldr r3, [%P0_y]        | dmb ish;
  DMB ISH |;

exists (0:r3=0 /\ 1:r0=0)
```

To fix this, add a dmb before the ldr when compiling that load: `dmb;ldr;dmb` to forbid the buggy behaviour:
```
ARM test

{ int x = 0; int y = 0;
  uint32_t %P0_x=x; uint32_t %P0_y = y;
  uint32_t %P1_x=x; uint32_t %P1_y = y;
 1:R2 = 1; }

P0                       | P1;
label: ldaex r1, [%P0_x] | dmb ish;
  add r1, r1, #1         | str r2, [%P1_y] ;
  stlex r2, r1, [%P0_x]  | dmb ish;
  cbnz r2, label         | ldr r0, [%P1_x];
  DMB ISH                | ;
  ldr r3, [%P0_y]        | dmb ish;
  DMB ISH | ;

exists (0:r3=0 /\ 1:r0=0)
```
which no longer allows the buggy outcome under the aarch32 model:
```
{ P0:R3=0; P1:R0=1; }
{ P0:R3=1; P1:R0=0; }
{ P0:R3=1; P1:R0=1; }
```

This bug would not have been caught in normal execution, but only when multiple implementations are mixed together.
</pre>
<img width="1px" height="1px" alt="" src="http://email.email.llvm.org/o/eJzsV1-P4jgS_zTmpdQocUiABx74c-hGmpFQ90h3b8iJK4lvnJixHRrm05_shCZhYKd3t6XVSotQILHrj6t-v6oKM0YUNeKCxCsSb0assaXSC9l8wyIMwlGq-HmxVrURHDXYEiFXUqpXURcgha0aAxaNBVsyCyUzkDZFcYYUS3YUqtEkWpJgQ4IlSYLu62_XXqxbaq_TFZyARBsISLSCc_sXyHTT37ULgNAZs6oS2V7UFghdnghdw_DRmdA5kOmqFbos5mizcs843-PpIEUmLKEzJx0Suq6wUvq8V5qj3hv8vs-MJXROoosSELWN6N6CDrxvnVKp2EDf-R26bg51VILDLvwTJzNWabz14iNPdfr9p8KTMNa4bC473S53WxKvIew9uwuP1xI1AqHTXV-c0ClUyGoDttTIOOwCFyCpMibhyLRgqUR3EAfEI5MNvqn31_-UWIMRVSOZdQC2pejg29QXeK8J3a4JXRG6gkpxlJBrVYGwBkQtrGASjGUWnWG3XzU2UxWCygFPmDVWqBpE7de6CGSSNQZBGMiVTgXnWEN69juManSGraExfC0RmGMX8oteA0zjIxI5ynQBItHG02YXXm5Dd3tNyWBrONwavH_rjdahPz6-maoOQrrw7kIfJAWW6QItMF0dZ08MMqUtnp5YGDvEWwUkCXiVkmhlrCbRqv0vuXaK6Rwa47RlkrmU6ab-5sRKaw_GBYZuCd0WiqdK2rHSBaHbH4Ruj0lm1PTH99ghlK57brm4O3i7lJ1B1bALbpyc_uQkSQLnT-vbA688V2t-Y-mt5jhzp85crvRtNKrYG2LS-HhIrvHURsT_Zmn9gyQBSKUOhM5brLSWkMNBq0KzyuN-WKPfYPTawz7yHuAZ01kZ0RaDv0TaczRA2nMfPuA-JFo_PT0Nkd6i3MUnbQpCw3fofIBevzW8b_4dW38Tva-lyEpQWdZoA0bUGbYUxe8N1o748gyZ64TGYm27QvkTkoS5psVeUvkGHGiNcIUGauU65hGBgUTGXbZ4lUKOdRsrjyXmKkljIEXQ6CsvckgxV7r17uXr53_9tzU9huUl626lbcSX8vSg8ozvRmL5_OVue3YtqNei3e357fbST5pLQyE03gX7E4k2DsC3z1vB8yPB8IFgeEfQp5f6xzcJvowM9z9kuvbY6BRJlnoCgOQMT6B99XJjUXeOeOMlXI6EKXv2Hbnb3Z0MjcKBEWM1aNpTF-7PXt1Vh7HS2aQ9LQPLD0y7stBJefcHZiXXoIOBWaerJ-53RANr3rGekp9tbr6s4NPLv93qtfX3Oz6hM98-PK2HLd_Tlc7v1xh__aogFyffmj0JOAfmneiB3rn9Ouw1fv50XHQJfOsnA-pZ1bGgR49fzql_Lyr8w4Q_yoQLqO8E5gP5Ah9KmLaX1AqkqgvU7eho7hT_j2z2f0Fj7gqDG9bTpoBX1Uh-7Z4pukLAmqK0bu6ula6YvM7i7dRhQdXy3BaNqpFWHCSCqA4SK6wtcxv9pA2VOPnGXaAtUY9HfBHxeTRnI1yEyXwSRLM4SUblgod8GvMozGkcBJMwiCazLImTSRDEfJbnfCQWNKBRMA8SSsMkSMaTGU94modhRHEaBTMyCbBiQo6lPFZucB0JYxpcJHE8CUcew8a_lVNa4yv4RUKpe0nXCyfzlDaFIZNAOgxdtVhhpX-dX7aTrMf5El76Y8z6OsZ8dvPLskXOczth-Lkxh50WSsOLe7NsA_fpTry-uHiNGi0XNwO5sGWTjjNVEbp1vnU_Twet_oeZJXTrT2QI3foT_z8AAP__2UOyog">