<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/59632>59632</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            clang CUDA: shared variables sometimes are not updated (on Pascal architecture)
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          AuroraPerego
      </td>
    </tr>
</table>

<pre>
Shared variables sometimes appear not to be updated, unless some other operation on them forces them to be read again.
**This behaviour has been observed on an NVIDIA GeForce GTX 1080 Ti (sm_61) and on a Tesla P100 (sm_60), but not on newer GPUs. Older GPUs have not been tested.**

To reproduce (the GitHub repository is [here](https://github.com/AuroraPerego/clangSharedVar)):
```bash
git clone https://github.com/AuroraPerego/clangSharedVar
cd clangSharedVar
make environment
source env.sh
make
```
This will trigger an assertion failure. Now open `radixSort.h` and uncomment lines 181-182 and 206-207.
After this change, run:
```bash
make clean
make
./test
```
This will compile and run fine (and print a bunch of numbers).

After playing around a bit, we found that there are other ways to obtain correct results:
- replace the `__syncthreads()` calls at lines 180 and 205 with `__threadfence_block()`
- remove the `__syncthreads()` calls at lines 180 and 205
- do an `atomicAdd(&sharedVar, 0)`

The shared variables involved are `ibs` and `p`. Since only the first thread writes to them, it seems that the other threads do not notice that the values have changed and do not re-read them.
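
The access pattern described above can be sketched roughly as follows. This is a minimal illustration only, not the code from the repository: the kernel name `countRounds`, the variable `sharedVar`, the macro `USE_ATOMIC_READ` and the launch sizes are all hypothetical, and we have not checked whether this reduced version triggers the problem.

```cuda
// Hypothetical sketch of the pattern: thread 0 updates a __shared__ variable,
// every thread re-reads it after a barrier. Not the code from the repository.
#include &lt;cstdio&gt;
#include &lt;cuda_runtime.h&gt;

__global__ void countRounds(int* out, int rounds) {
  __shared__ int sharedVar;  // plays the role of `ibs` / `p`

  if (threadIdx.x == 0) sharedVar = rounds;  // rounds is assumed to be >= 0
  __syncthreads();

  int seen = 0;
  while (true) {
#ifdef USE_ATOMIC_READ
    // Workaround from the list above: read through an atomic that adds 0.
    int current = atomicAdd(&amp;sharedVar, 0);
#else
    int current = sharedVar;  // plain read of the value written by thread 0
#endif
    if (current == 0) break;  // same value in every thread, so the exit is uniform
    ++seen;
    __syncthreads();                    // every thread has read the current value
    if (threadIdx.x == 0) --sharedVar;  // only the first thread writes
    __syncthreads();                    // the write is complete before the next read
  }
  out[threadIdx.x] = seen;
}

int main() {
  constexpr int kThreads = 256;
  constexpr int kRounds = 10;

  int* d_out = nullptr;
  if (cudaMalloc(&amp;d_out, kThreads * sizeof(int)) != cudaSuccess) return 1;

  countRounds&lt;&lt;&lt;1, kThreads&gt;&gt;&gt;(d_out, kRounds);

  int h_out[kThreads];
  if (cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost) != cudaSuccess) return 1;
  cudaFree(d_out);

  // If every thread observed every update, each entry equals kRounds.
  int mismatches = 0;
  for (int i = 0; i != kThreads; ++i)
    if (h_out[i] != kRounds) ++mismatches;
  std::printf("mismatches: %d\n", mismatches);
  return mismatches == 0 ? 0 : 1;
}
```

When every thread sees each update, all entries of `out` equal `kRounds`. Building with `-DUSE_ATOMIC_READ` switches the plain read to the `atomicAdd` variant, which makes it easy to compare the two with something like `clang++ sketch.cu --cuda-gpu-arch=sm_60 -lcudart` (the file name and the CUDA library path are assumptions and depend on the local installation).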

clang version: clang++ 12 to 15 have been used
CUDA version: 11.5 

cc: @fwyzard
</pre>