<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/122760>122760</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[NVPTX] atomicrmw on <4 x float> relies on __atomic_compare_exchange_16
</td>
</tr>
<tr>
<th>Labels</th>
<td>
backend:NVPTX
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
Artem-B
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Artem-B
</td>
</tr>
</table>
<pre>
NVPTX currently lowers `atomicrmw` on `<4 x float>` to a call to `__atomic_compare_exchange_16`, which does not exist on the GPU:
https://godbolt.org/z/ovf4cqKK5
Newer GPUs support vectorized atomic ops on some data types, but on older GPUs these operations must be lowered without relying on a runtime library.
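
For reference, a minimal sketch of the kind of input that hits this path. It is not necessarily the exact code behind the godbolt link: the function name, the choice of `fadd`, and the `seq_cst` ordering are assumptions made for illustration.

```llvm
; Hypothetical reproducer: the function name, the fadd operation,
; and the seq_cst ordering are illustrative assumptions.
target triple = "nvptx64-nvidia-cuda"

define <4 x float> @atomic_fadd_v4f32(ptr %addr, <4 x float> %val) {
  ; A single 128-bit atomic read-modify-write on a vector of four floats.
  %old = atomicrmw fadd ptr %addr, <4 x float> %val seq_cst, align 16
  ret <4 x float> %old
}
```

Running this through something like `llc -mcpu=sm_70` should show whether the generated PTX still references `__atomic_compare_exchange_16`.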
</pre>