<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/65806>65806</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
clang: CUDA error: compiling relocatable device code with dynamically allocated shared memory passed through a lambda expression will cause an error.
</td>
</tr>
<tr>
<th>Labels</th>
<td>
clang,
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
jacobtrombetta
</td>
</tr>
</table>
<pre>
# Overview
Compiling relocatable device code (`-fcuda-rdc`) with dynamically allocated shared memory (`extern __shared__`) passed through a lambda expression will cause a `fatal: Variable used as initial value not in .global or .const state space` error in the clang-18 compiler.
The work around is to pass an object that points to the dynamically allocated shared memory object.
# Error
```
clang++ -x cuda -c main.cc -o main.o --cuda-gpu-arch=sm_70 -std=c++20 -fcuda-rdc
clang++: warning: CUDA version 12.1 is only partially supported [-Wunknown-cuda-version]
ptxas /tmp/main-sm_70-a10293.s, line 51; fatal : Variable used as initial value not in .global or .const state space
ptxas fatal : Ptx assembly aborted due to errors
clang++: error: ptxas command failed with exit code 255 (use -v to see invocation)
Ubuntu clang version 18.0.0 (++20230908042326+cf51876dd909-1~exp1~20230908042444.1172)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
clang++: note: diagnostic msg:
********************
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang++: note: diagnostic msg: /tmp/main-42ef64.cu
clang++: note: diagnostic msg: /tmp/main-sm_70-d402df.cu
clang++: note: diagnostic msg: /tmp/main-42ef64.sh
clang++: note: diagnostic msg:
********************
```
# Isolated example that reproduces the error
```
template <class F>
__host__ __device__ void launch_kernel(unsigned int thread_id, F f) noexcept {
switch (thread_id) {
case 0: {
return f(std::integral_constant<unsigned int, 0>{});
}
case 1: {
return f(std::integral_constant<unsigned int, 1>{});
}
}
__builtin_unreachable();
}
template <typename T, unsigned int thread_id>
__device__ void dynamically_allocated_input_from_lambda(T* shared_data){
shared_data[thread_id] = thread_id;
}
template <typename T>
__global__ void kernel() {
/////////////////////////////////////////////////////////
// Source of relocatable device code compilation error //
/////////////////////////////////////////////////////////
extern __shared__ T shared_data[];
/////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////
////////////////
// Workaround //
////////////////
//extern __shared__ T shared_data_d[];
//T* shared_data = shared_data_d;
////////////////
////////////////
////////////////
launch_kernel(
threadIdx.x,
[=]<unsigned thread_id>(std::integral_constant<unsigned, thread_id>) noexcept {
dynamically_allocated_input_from_lambda<T, thread_id>(shared_data);
});
}
int main()
{
using T = unsigned int;
kernel<T><<<2, 2, sizeof(T) * 2>>>();
cudaDeviceSynchronize();
}
```
# Supporting files
The code above, requested files for the bug report, and supporting files for reproducing the issue in a Docker container are attached.
[clang_rdc_issue.zip](https://github.com/llvm/llvm-project/files/12563670/clang_rdc_issue.zip)
</pre>
<img width="1px" height="1px" alt="" src="http://email.email.llvm.org/o/eJzcWEtv27gW_jX05kCGRNmKvfAiju2ZAMVNMXFvlwJFHVtsKFKXpByni_ntF6TkV5qgSVtg0AEEWw-ejx_Pm2TWiq1CnJHxnIwXA9a6SpvZF8Z14YyuC3SODQpdPs0ITeFuh2Yn8JHECxJf3-i6EVKoLRiUmjPHColQ4k5wBK5LBEInJIujDW9LFpmSkywmdAqPwlVQPilWC86kfAImgzyWYCtmsIQaa22eenncOzQK8rz7mOc9TMOsxRJcZXS7rYCBZHVRMsB9Y9BaoRU8CimBs9YiMCBZvGGOSZJew3-ZEYFu6yGYBaGEE0zCjskWQWkHQsFwK3XBJGgDQ66VdWAdcwi2YRxJFgMao40f6SoELpnaRskEeFAMmiF0iup-1xXCozYPwIxuVQnCgtNhEcAU6OILcgeuYg4aLZQLXz3sWxTVSQ_Pp_P2Wnp6_WMW91d4DFwJnRM6h2gP3kAQcaiZUEPOIdLdrYYoCsbbNm3EDK9IurB1fhVDZF1J0gXvMGgMZ1Z-PoNX-CMzSqitv735tLiGHZpgoYQOE68KreQTNMx4I8gnsG3TaOMXSsbz6HOrHpR-VB2XXpSMF91MjdszC4SuXN0QuvLEo0AyYklMp-nQEnoDUiiEcULSOQQvgF_lBmccOmAI0B_dHryD1oW3W9EtpmzRmzW4jX1RT-GTv-kQua5rpkrYMCGx7CIH98J18UXHYx8j3r2jnQe2iCDUzvuI1xCddnN8Klrl2s5BT5qfDONhHGKstyFN42k8iUc0pRmhc74ZJ5OrrCyn8TRK_sZ9k_x9Nmg0Gg2T5IoeZ1kzs0Xnue8nWZ6NooZHUqh2H21Ve4gBg6yEWpcYwrDRVuy7T7fKOiYllgsR1k_oqrWG0FUh1IuaUtqh_y8F2yptneBQ2-BghwD4iesskD5-WF7fL-F6vb6--RPWfy5hdffhw93n2__8AavbD8t7WN-F1_NPf8Bfy493f61JepA12BjNMSQqq1vDkdCJ9dnLW5VZq7kI8WxaBZYb0bjjAINwiHbmjpBv18JFQIwobrLRkLc_hdKFVTmKabn5Wayeka1-zLq_ysaXefGUO2-tlkH3uGd1I7FLzcGeZcvRhtyMryZYh3Xj5YGkN1z6JL8i6bInn-eVti7PIc-7cpnnsNOiBMlaxav8AY1C6SNbhfpcglC-NvjgyUXp09kKNt5JlMY9x8YBuZp32AD2UThe-cA-k5iej-DMIsRBmaeXAAZda5QHnoTsfk3Sa6Ecbg2TeUh8TDmS3pyz8lxiki490NXC54L0iOhfnM-Y_LIZk-_PeHab50UrpBMqb5VBxiuf8kPeO8keh39rPvfUoGI1wtpP_YpJ0uXBtJcWPave-bF650I1rcs3Rtd517QQOlkTet2X9bxkjnl2R12dvx_PT9OOF0DSxTmPd63nxLorcQfWRwe8dBxCV7_NdXCD8AT3IfeC3rzaq3ZNW6ibfVt3AfTPL-jdS_-mb4b1Mz_yLVT6-xr3n-fxr2f8o7wvo--zNg_9ruel7z-K-x0Hz8vnLt7LPc-0IYVeCv54VPySlb1brv973j-cKi30NeK23A_3hN6cf_FqShdBU6dKe1Ha3liefX28lHuxQ3lrTUxv1t8iTp6VyJOlLpuBZwXQV2vfdvZlvx9ydIzWCrWFdfCEi2bjCN_r1HNKlyS96S7qCYYfK76i3nR1fArew2gYuOxYX1DrEP1mdhEq0P2T4pXRSnx9vS15rU-973bKnv5GSLSnw4ZQ1lihd-j5Gfxfi9Y3tGEYbLQJLWzRbn1Xq01orPyuxD5DDEMPja9_68WEta3fawKDheYPaIBr5ZhQaMLOhTnHeIXl8QxkPA8dfm5Kngfh4VfReJ-jk8q5xnrvCh69Fa5qiyHXNaErKXeHv6gx-gtyR-iqWyhdJXScpdlVTOjqJfCDnQflLC2n6ZQNcJZk0xGNr2gcD6oZHdN0w2POcDSdMr8fSWkaZ1csy3jJRuOBmB02vDSJs1E8TIpNkkxGV5M03aSYcjKKsWZCDj3DoTbbQZh-lo0ncTaQrEBpw-Eapf0GxzsLoVThY6dD_2a8GJhZWGPRbi0ZxVJYZ0-YTjiJsw7gcIByPCjg3z2Ge_N52ztP1PpGaThojZy924hh9d6KQVf_DwAA__-4qNgX">