<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Sep 22, 2015 at 11:49 AM, Richard Smith <span dir="ltr"><<a href="mailto:richard@metafoo.co.uk" target="_blank">richard@metafoo.co.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><p dir="ltr">It seems like the real problem here is that we're giving the template instantiation the wrong linkage. It can be used from outside this llvm module, so it should be weak_odr instead of linkonce_odr.</p></blockquote><div>This indeed works much better. I've just sent <a href="http://reviews.llvm.org/D13067">http://reviews.llvm.org/D13067</a></div><div><br></div><div>--Artem <br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class=""><div class="h5">
<div class="gmail_quote">On Sep 22, 2015 10:24 AM, "Artem Belevich via cfe-commits" <<a href="mailto:cfe-commits@lists.llvm.org" target="_blank">cfe-commits@lists.llvm.org</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Author: tra<br>
Date: Tue Sep 22 12:22:51 2015<br>
New Revision: 248293<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=248293&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=248293&view=rev</a><br>
Log:<br>
[CUDA] Add implicit __attribute__((used)) to all __global__ functions.<br>
<br>
This makes sure that we emit kernels that were instantiated from the<br>
host code and which would never be explicitly referenced by anything<br>
else on device side.<br>
<br>
Differential Revision: <a href="http://reviews.llvm.org/D11666" rel="noreferrer" target="_blank">http://reviews.llvm.org/D11666</a><br>
<br>
Modified:<br>
    cfe/trunk/lib/Sema/SemaDeclAttr.cpp<br>
    cfe/trunk/test/CodeGenCUDA/<a href="http://ptx-kernels.cu" rel="noreferrer" target="_blank">ptx-kernels.cu</a><br>
<br>
Modified: cfe/trunk/lib/Sema/SemaDeclAttr.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaDeclAttr.cpp?rev=248293&r1=248292&r2=248293&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaDeclAttr.cpp?rev=248293&r1=248292&r2=248293&view=diff</a><br>
==============================================================================<br>
--- cfe/trunk/lib/Sema/SemaDeclAttr.cpp (original)<br>
+++ cfe/trunk/lib/Sema/SemaDeclAttr.cpp Tue Sep 22 12:22:51 2015<br>
@@ -3350,6 +3350,10 @@ static void handleGlobalAttr(Sema &S, De<br>
   D->addAttr(::new (S.Context)<br>
               CUDAGlobalAttr(Attr.getRange(), S.Context,<br>
                              Attr.getAttributeSpellingListIndex()));<br>
+<br>
+  // Add implicit attribute((used)) so we don't eliminate kernels<br>
+  // because there is nothing referencing them on device side.<br>
+  D->addAttr(UsedAttr::CreateImplicit(S.Context));<br>
 }<br>
<br>
 static void handleGNUInlineAttr(Sema &S, Decl *D, const AttributeList &Attr) {<br>
<br>
Modified: cfe/trunk/test/CodeGenCUDA/<a href="http://ptx-kernels.cu" rel="noreferrer" target="_blank">ptx-kernels.cu</a><br>
URL: <a href="http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCUDA/ptx-kernels.cu?rev=248293&r1=248292&r2=248293&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCUDA/ptx-kernels.cu?rev=248293&r1=248292&r2=248293&view=diff</a><br>
==============================================================================<br>
--- cfe/trunk/test/CodeGenCUDA/<a href="http://ptx-kernels.cu" rel="noreferrer" target="_blank">ptx-kernels.cu</a> (original)<br>
+++ cfe/trunk/test/CodeGenCUDA/<a href="http://ptx-kernels.cu" rel="noreferrer" target="_blank">ptx-kernels.cu</a> Tue Sep 22 12:22:51 2015<br>
@@ -1,7 +1,16 @@<br>
+// Make sure that __global__ functions are emitted along with correct<br>
+// annotations and are added to @llvm.used to prevent their elimination.<br>
+// REQUIRES: nvptx-registered-target<br>
+//<br>
 // RUN: %clang_cc1 %s -triple nvptx-unknown-unknown -fcuda-is-device -emit-llvm -o - | FileCheck %s<br>
<br>
 #include "Inputs/cuda.h"<br>
<br>
+// Make sure that all __global__ functions are added to @llvm.used<br>
+// CHECK: @llvm.used = appending global<br>
+// CHECK-SAME: @global_function<br>
+// CHECK-SAME: @_Z16templated_kernelIiEvT_<br>
+<br>
 // CHECK-LABEL: define void @device_function<br>
 extern "C"<br>
 __device__ void device_function() {}<br>
@@ -13,4 +22,10 @@ __global__ void global_function() {<br>
   device_function();<br>
 }<br>
<br>
+// Make sure host-instantiated kernels are preserved on device side.<br>
+template <typename T> __global__ void templated_kernel(T param) {}<br>
+// CHECK-LABEL: define linkonce_odr void @_Z16templated_kernelIiEvT_<br>
+void host_function() { templated_kernel<<<0,0>>>(0); }<br>
+<br>
 // CHECK: !{{[0-9]+}} = !{void ()* @global_function, !"kernel", i32 1}<br>
+// CHECK: !{{[0-9]+}} = !{void (i32)* @_Z16templated_kernelIiEvT_, !"kernel", i32 1}<br>
<br>
<br>
_______________________________________________<br>
cfe-commits mailing list<br>
<a href="mailto:cfe-commits@lists.llvm.org" target="_blank">cfe-commits@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits</a><br>
</blockquote></div>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">--Artem Belevich</div></div>
</div></div>