<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/88497>88497</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[AMDGPU] -Os can cause clang++ to crash with "fatal error: error in backend: cannot lower memory intrinsic in address space 5"
</td>
</tr>
<tr>
<th>Labels</th>
<td>
clang
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
ex-rzr
</td>
</tr>
</table>
<pre>
Some benchmarks and tests from https://github.com/ROCm/rocPRIM cannot be compiled with `CMAKE_BUILD_TYPE=MinSizeRel` (which sets `-Os`). The default `CMAKE_BUILD_TYPE=Release` (`-O3`) works without issues, and `-O2` and `-O1` also seem to work.
```
fatal error: error in backend: cannot lower memory intrinsic in address space 5
clang++: error: clang frontend command failed with exit code 70 (use -v to see invocation)
AMD clang version 17.0.0 (https://github.com/RadeonOpenCompute/llvm-project roc-6.0.0 23483 7208e8d15fbf218deb74483ea8c549c67ca4985e)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm/llvm/bin
Configuration file: /opt/rocm-6.0.0/lib/llvm/bin/clang++.cfg
```
The test cases that cannot be compiled use large data types like `custom_test_array_type` (https://github.com/ROCm/rocPRIM/blob/rocm-6.0.2/test/rocprim/test_utils_custom_test_types.hpp#L128). For example, `custom_test_array_type<int, 10>` in `test_device_scan` and `custom_test_array_type<int, 32>` in `test_device_radix_sort`.
These types are intentionally large so that registers may spill, but spilling does not always happen.
The fatal error is raised in `checkAddrSpaceIsValidForLibcall`: https://github.com/llvm/llvm-project/blob/release/17.x/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp#L7658
I found a couple of workarounds that helped in some cases (but they are not universal):
* adding explicit non-default copy ctor and operator= to `custom_test_array_type` (or similar types);
* passing parameters of such types to functions by reference instead of by value;
Here is a reproducer based on one of the rocPRIM tests: `reproducer-rocrpim-scan-by-key.cpp`. It can be compiled with
```
/opt/rocm/llvm/bin/clang++ -D__HIP_PLATFORM_AMD__=1 -isystem /opt/rocm/include -Os -DNDEBUG -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -x hip --offload-arch=gfx908:xnack- -std=c++14 -o reproducer-rocrpim-scan-by-key.o -c reproducer-rocrpim-scan-by-key.cpp
```
I suppose it's not very useful for debugging because it uses a lot of code from rocPRIM, so I created another, much simpler reproducer with the same behavior and hopefully the same root cause: `reproducer-standalone.cpp`.
It has no external dependencies except HIP. You can build it with
```
/opt/rocm/llvm/bin/clang++ -D__HIP_PLATFORM_AMD__=1 -isystem /opt/rocm/include -Os -x hip --offload-arch=gfx908:xnack- -std=c++14 -o reproducer-standalone.o -c reproducer-standalone.cpp
```
There are 3 kernels (uncomment them):
* `kernel_bad` calls another function and passes `custom_test_array_type` by value, the compiler crashes;
* `kernel_good1` does not call any functions, can be built;
* `kernel_good2` calls another function but passes `custom_test_array_type` by reference (one of the workarounds), can be built.
Also `kernel_bad` can be built with another workaround: uncomment `custom_test_array_type`'s explicit copy constructor.
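
The difference between these cases can be sketched roughly in plain host C++ as below. This is only an illustration and not the attached reproducer: `array_like`, `sum_by_value` and `sum_by_ref` are hypothetical names, and in the real code the callees are HIP device functions called from kernels.

```cpp
#include <cstddef>

// Hypothetical stand-in for custom_test_array_type (assumption: the real
// type is essentially a wrapper around a fixed-size array).
template <class T, std::size_t N>
struct array_like
{
    T values[N];

    // Workaround: a user-provided copy constructor (uncomment to try).
    // array_like() = default;
    // array_like(const array_like& other)
    // {
    //     for(std::size_t i = 0; i < N; ++i)
    //         values[i] = other.values[i];
    // }
};

// Pattern used by kernel_bad: the large aggregate is passed by value,
// so the whole array is copied for the call.
template <class T, std::size_t N>
T sum_by_value(array_like<T, N> a)
{
    T result = T();
    for(std::size_t i = 0; i < N; ++i)
        result += a.values[i];
    return result;
}

// Pattern used by kernel_good2: the aggregate is passed by const reference,
// so no copy of the array is made for the call.
template <class T, std::size_t N>
T sum_by_ref(const array_like<T, N>& a)
{
    T result = T();
    for(std::size_t i = 0; i < N; ++i)
        result += a.values[i];
    return result;
}
```

Both functions compute the same result; only the parameter passing differs.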
[reproducer-rocrpim-scan-by-key.cpp.txt](https://github.com/llvm/llvm-project/files/14957688/reproducer-rocrpim-scan-by-key.cpp.txt)
[reproducer-standalone.cpp.txt](https://github.com/llvm/llvm-project/files/14957691/reproducer-standalone.cpp.txt)
</pre>