[all-commits] [llvm/llvm-project] abd8cd: [CUDA][HIP] Fix linkage for -fgpu-rdc

Yaxun (Sam) Liu via All-commits all-commits at lists.llvm.org
Tue Nov 3 05:08:03 PST 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: abd8cd9199d1e14cae961e1067b78df7044179a3
      https://github.com/llvm/llvm-project/commit/abd8cd9199d1e14cae961e1067b78df7044179a3
  Author: Yaxun (Sam) Liu <yaxun.liu at amd.com>
  Date:   2020-11-03 (Tue, 03 Nov 2020)

  Changed paths:
    M clang/lib/CodeGen/CodeGenModule.cpp
    A clang/test/CodeGenCUDA/device-fun-linkage.cu

  Log Message:
  -----------
  [CUDA][HIP] Fix linkage for -fgpu-rdc

Currently, for explicit function template instantiation in CUDA/HIP device
compilation, clang emits the instantiated kernel with external linkage
and the instantiated device function with internal linkage.

This is fine for -fno-gpu-rdc, since there is only one translation unit (TU).

However, with -fgpu-rdc this causes duplicate kernel symbols if
the same instantiation happens in multiple TUs, or missing symbols
if a device function calls an explicitly instantiated template function
defined in a different TU.

To make explicit template function instantiation work with
-fgpu-rdc, we need to follow the C++ linkage paradigm, i.e.
use weak_odr linkage.
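
For illustration, a minimal sketch of the kind of source this affects. The
names scale and scale_kernel are hypothetical and are not taken from the
patch or its test. The fallback macros are an assumption added only so the
sketch also parses with a plain C++ compiler. Per the description above,
with -fgpu-rdc both explicit instantiations would be expected to get
weak_odr linkage in device code, so the same instantiation in two TUs no
longer clashes and a caller in another TU can resolve the device function.

```cpp
// Hypothetical example; not part of the commit.
// Fallback so the sketch also compiles as ordinary C++ (assumption:
// none of these macros are defined outside CUDA/HIP compilation).
#if !defined(__CUDACC__) && !defined(__CUDA__) && !defined(__HIP__)
#define __global__
#define __device__
#endif

// A device function template.
template <typename T>
__device__ T scale(T x) { return x * 2; }

// A kernel template that calls the device function template.
template <typename T>
__global__ void scale_kernel(T *data) { data[0] = scale(data[0]); }

// Explicit instantiation definitions. Before this change, the device
// function instantiation had internal linkage and the kernel instantiation
// had external linkage in device compilation, as described above.
template __device__ int scale<int>(int);
template __global__ void scale_kernel<int>(int *);
```

A second TU that calls scale<int> from device code, or that repeats the
same explicit instantiation of scale_kernel<int>, is the situation the log
message describes for -fgpu-rdc.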

Differential Revision: https://reviews.llvm.org/D90311



