[all-commits] [llvm/llvm-project] abd8cd: [CUDA][HIP] Fix linkage for -fgpu-rdc
Yaxun (Sam) Liu via All-commits
all-commits at lists.llvm.org
Tue Nov 3 05:08:03 PST 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: abd8cd9199d1e14cae961e1067b78df7044179a3
https://github.com/llvm/llvm-project/commit/abd8cd9199d1e14cae961e1067b78df7044179a3
Author: Yaxun (Sam) Liu <yaxun.liu at amd.com>
Date: 2020-11-03 (Tue, 03 Nov 2020)
Changed paths:
M clang/lib/CodeGen/CodeGenModule.cpp
A clang/test/CodeGenCUDA/device-fun-linkage.cu
Log Message:
-----------
[CUDA][HIP] Fix linkage for -fgpu-rdc
Currently, for explicit template function instantiation in CUDA/HIP device
compilation, clang emits the instantiated kernel with external linkage
and the instantiated device function with internal linkage.
This is fine for -fno-gpu-rdc since there is only one TU.
However, with -fgpu-rdc this causes duplicate symbols for kernels if
the same instantiation happens in multiple TUs, or missing symbols
if a device function calls a template function that is explicitly
instantiated in a different TU.
To make explicit template function instantiation work for
-fgpu-rdc, we need to follow the C++ linkage paradigm, i.e.
use weak_odr linkage.
Differential Revision: https://reviews.llvm.org/D90311
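
For illustration only, here is a minimal sketch of the kind of source the log
message is talking about: a device function template and a kernel template,
each with an explicit instantiation definition. The names scale and
scale_kernel are hypothetical and are not taken from the commit or its test
file. The fallback macros are an assumption added so that the same file also
builds as plain host C++; in a real CUDA/HIP compilation the attributes come
from the toolchain.

```cpp
// Hypothetical example; not part of the commit or of device-fun-linkage.cu.
// Fallback (an assumption for this sketch): outside CUDA/HIP compilation,
// define the attributes away so the file still parses as ordinary C++.
#if !defined(__CUDACC__) && !defined(__HIPCC__) && \
    !defined(__CUDA__) && !defined(__HIP__)
#define __device__
#define __global__
#endif

// Device function template.
template <typename T>
__device__ T scale(T x, T factor) {
  return x * factor;
}

// Kernel template that calls the device function template.
template <typename T>
__global__ void scale_kernel(T *value, T factor) {
  *value = scale(*value, factor);
}

// Explicit instantiation definitions. Per the log message, before the fix
// device compilation gave the kernel external linkage and the device function
// internal linkage; with -fgpu-rdc the fix uses weak_odr linkage, as in C++.
template __device__ float scale<float>(float, float);
template __global__ void scale_kernel<float>(float *, float);

// A second TU could refer to the device function instantiation through an
// explicit instantiation declaration, for example:
//   extern template __device__ float scale<float>(float, float);
```

If two TUs both contain these explicit instantiation definitions, external
linkage for scale_kernel<float> would produce the duplicate symbols described
in the log message, and internal linkage for scale<float> would leave the
reference from another TU unresolved. With weak_odr linkage the linker can
merge identical definitions and resolve cross-TU references.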