<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/65121>65121</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [PPC64] Sequentially Consistent Load allows Reordering of Stores
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:PowerPC
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          lukeg101
      </td>
    </tr>
</table>

<pre>
    Consider the following litmus test that captures bad behaviour:

```
C test

{ *x = 0; *y = 0; } // fixed initial state where x and y are 0

// Concurrent threads
P0 (atomic_int* y,atomic_int* x) {
 atomic_store_explicit(x,2,memory_order_relaxed);
 atomic_thread_fence(memory_order_consume);
 atomic_store_explicit(y,1,memory_order_seq_cst);
}

P1 (atomic_int* y,atomic_int* x) {
  int r0 = atomic_load_explicit(y,memory_order_seq_cst);
 atomic_thread_fence(memory_order_relaxed);
 atomic_store_explicit(x,1,memory_order_relaxed);
}

exists ([x]=2 /\ 1:r0=1)

```
where 'P0:r0 = 0' means thread P0, local variable r0 has value 0. [x] means x is a global.

When simulating this test under the C/C++ model from its initial state, the outcome of execution in the exists clause is forbidden by the source model. The allowed outcomes are:
```
{ P1:r0=0; [x]=1; }
{ P1:r0=0; [x]=2; }
{ P1:r0=1; [x]=1; }
```
When compiling to target PPC64le with clang trunk `-c -g -O1 -pthread --std=c11 -ffreestanding -mabi=elfv1` (https://godbolt.org/z/xj9W7nr9G), the compiled program has the following outcomes when simulated under the PPC model:
```
{ P1:r0=0; [x]=1; } 
{ P1:r0=0; [x]=2; } 
{ P1:r0=1; [x]=1; }    
{ P1:r0=1; [x]=2; } <--- forbidden by source model, bug!
```
which occurs since the `b L0x50; isync` pattern allows the memory effect of the load of y on thread P1 to be reordered after the store to x on P1. We observe the execution:

{ P1:r0=0;[x]=0} -> P1:W[x]=1 -> P0:W[x]=2 -> P0:W[y]=1 -> P1:R[y]=0 -> { P1:r0=1; [x]=2; } 

where the local variable `r0` is stored in register r9 in the compiled code.

The problem is isync forces the instructions to finish, but not their memory effects. Hence the forbidden outcome { P1:r0=1; [x]=2; } is allowed under the ppc model.

```
PPC test

{ [P1_r0]=0;[x]=0;[y]=0;
  uint64_t %P0_x=x; uint64_t %P0_y=y;
  uint64_t %P1_P1_r0=P1_r0;uint64_t %P1_x=x;
  uint64_t %P1_y=y }
(*****************************************************************)
(* the Telechat toolsuite                        *)
(* *)
(* Luke Geeson, University College London, UK.                   *)
(* *)
(*****************************************************************)
 P0                 |  P1                    ;
   li r10,2          | sync                 ;
   stw r10,0(%P0_x)  |   lwz r9,0(%P1_y)      ;
 lwsync            |   cmpw r9,r9           ;
   sync              |   b L0x50            ;
   li r10,1          |  L0x50: isync          ;
 stw r10,0(%P0_y)  |   li r8,1              ;
                     |   stw r8,0(%P1_x)      ;
                     |   stw r9,0(%P1_P1_r0) ;

exists ([x]=2 /\ P1_r0=1)  // YES under PPC model

```
We have observed this behaviour with several hundred Message Passing and 'S' pattern tests (the above is an S pattern test).

To fix this, we advise replacing the isync with lwfence to forbid the buggy behaviour:

```
PPC test-fixed

{ [P1_r0]=0;[x]=0;[y]=0;
 uint64_t %P0_x=x; uint64_t %P0_y=y;
  uint64_t %P1_P1_r0=P1_r0;uint64_t %P1_x=x;
  uint64_t %P1_y=y }
(*****************************************************************)
(* the Telechat toolsuite                        *)
(* *)
(* Luke Geeson, University College London, UK.                   *)
(* *)
(*****************************************************************)
 P0                 |  P1                    ;
   li r10,2          | sync                 ;
   stw r10,0(%P0_x)  |   lwz r9,0(%P1_y)      ;
 lwsync            |   cmpw r9,r9           ;
   sync              |   b L0x50            ;
   li r10,1          |  L0x50: lwfence        ;
 stw r10,0(%P0_y)  |   li r8,1              ;
                     |   stw r8,0(%P1_x)      ;
                     |   stw r9,0(%P1_P1_r0) ;


exists ([x]=2 /\ P1_r0=1)  // NO under PPC model

```
which no longer allows the buggy outcome under the ppc model:
```
{ P1:r0=0; [x]=1; }                           
{ P1:r0=0; [x]=2; } 
{ P1:r0=1; [x]=1; } 
```

I'm happy to discuss this as needed.

I have reported the same bug in GCC.
</pre>
<img width="1px" height="1px" alt="" src="http://email.email.llvm.org/o/eJzsWV9v4zYS_zT0y8CGRMX_HvyQyM1ecXtXo9tDcE8GJY0kdinSJSlb7qc_kJQTy7Gz6e4Vhx42MJKI5AyHv5n5DUdmxvBKIq7I9IFM1yPW2lrplWg_YxVH8ShTxXGVKml4gRpsjVAqIdSBywoEt01rwKKxYGtmIWc722o0kLECMqzZnqtWk-SeRGsSnX7Pov7jH1MvP1gwfwBC7zsgyRoikvin49nTfA2EPhL6CCXvsAAuueVMgLHMIhxq1AgdMFnAEZhGiAbag2SqZN5qjdKZrpEVJkxvIiB0waxqeL7l0hJ6D0dC0-FIR-gSyPwhyEA_aazSuMVuJ3jOLaGLjtCUEpo22Ch93CpdoN5qFKzDgtAlSS4UBEu2JcocCV0MxHIlTdvgNbFX-zqD48t9Df62zY09V0Dm63NoNvFXnB24tKAj755-oVCsuDTnS6a8B4I3kLsK_SsIrii4gAA7bqxxMJDpQ0ema5KsqY-2aQoxSe51RJJ17DTcDukQgYTON5GX6EOXzqFBJk0fcLCJCE1BqJwJ2DPNWSbQIVkzA3smWoRoAr0ZvWQH3ACDSqiMicm5CU81SjC8aQWzLjdtzfvMbOUpc1NCH1NCHwh9gEYVKKDUqgFuzTCFnFluvWptrhoEVQJ2mLeWKwlc-rkeqFyw1qCzqlQ640WBErKjX2FUq3MMG03glxqBOeLA4qTXuOR8IYchho4ENs-Ah7x_9kjc08A7ltI3lsZvah3a4_HNVbPjwsOrwDJdoYXNJp3dCYQDt7WDw03qVn4GMovGOYwrGP8Uw3jXO308NrYgyTqPYxiXpUY0lsnC6Rw3LOMkWaMo9zGZeSKqrd0Zh5FnrUoVmRJ2onRF6OPvhD52vy6f5lIvP7iI7N0WrMQCdlpVmjU-oIbM_eyBw1nYYHEWK5tNGlz3TQ6CP-Chq2tvuAgA3rP8RXWSjsfjYYyex6eDLmsrQuMbCc3zGlSet9qA4TJHjxGZRRl8jLqpPxE3R5k7t-2YtahlCPeAfGAhwLLE3Lp8coOOJt3_R1DymRNiF1oZgkZPWVgAK23vE09ybr5zEpt4Ak8IKjOo99gnZZ-llyX3Ff4vGEUOnzFJfggrns7A7oej4TC9GD4OVzslP78MR2H4nW46szmQaMBpwJBkFmnnGkc6HhF3AQCNFTcOKL08UdRzHuSqwAFZOjLaaZUJbJwW7zkXHDkGd3FprG5zh6RxeJdcclOHILEglbszINdDt5oJ_A1PofESaScSfR8Cjt57lnzJxd0u72n0jaLjEvbqTWr6sIm3Ojp5e-h7_3Q8e-qLesulnd1tLRA63UTbjiTrztl4MX4kyfp4Syzeho2Tdf_34WL6pPWGtFd-Rsh0Qej9X_-zPD-O9-8vKDB392erlDAttwg3fq4ouDL0sf2M8AHRKOli9l-S71Ebbo-QKiGwQvioZNFP_n3ytfv89T_9cWATvUZgnoJj42vgvEQsCA46dvc4OhT1jPKWoLGHXjLyaIYso8t-ZxCH30Evz6ZdPrjpoSJxuNwpiOfN7hDk9fKWBa9MDKJ9SfviieMLwb4Q3vd0ekX42pmP52fmoBdDzZe7X_GGF_WqFwO4uitwfUHBEO_AWq7ZeW4WvtwjnCgvDucKzea_f_jU0_nLteo2kz8h1Gz_XNmLcJd_bqfDTdPgHjUTULeycCXwH2gMqxA2rqGXle9-CZ1_cl3H6ULiyoO325EOy9TeX9yZhE-DJYQuh9XSFcDOW-EY44DAij037o6yEywP3Qb2bvfGiUMZKqHqC6FfkLVVdXzva4FTORv7Fv-_UtS-17Q_m0i_17T__ed7TbvC7t9c006M9kr4_6SmfW1l--dPf6CwhTZaKhBKVqjPG-RQG0690pXW59teQ9z--bNeUFw31v_-kdB5AzXb7Y6uQhbc5K0xocozAxKxwGJQgX8MFwKNO6UthmpqWONhc93uhzSdjIpVUiyTJRvhKp4tkzs6Xy6mo3qVRIzGZTKncYIlzpDR2aKIWTHNsizO8uWIr2hEk2iRRDRKlnfTST6N6YzF5QIR5_M7JHcRNoyLiRD7ZqJ0NeLGtLiaTWMajwTLUBj__QGlGcs_oyxIcr9RB9SblFBKpuuRXjnZcdZWhtxFwoXZizbLrfBfQPh3aWS6hk_4W4vSciaEI3hpXG8vLXxUrDiFzc_hDYl_nVXCJ6s0mlGrxerinRm3dZtNctUQ-ui27P-Md1r9irkl9NGfxhD66A_0nwAAAP__Reybug">