<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Certain CUDA codes produce &quot;invalid device function&quot; - appears to be fixed in trunk"
   href="https://bugs.llvm.org/show_bug.cgi?id=41597">41597</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Certain CUDA codes produce "invalid device function" - appears to be fixed in trunk
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>8.0
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>-New Bugs
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>philip.salzmann@uibk.ac.at
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>htmldeveloper@gmail.com, llvm-bugs@lists.llvm.org, neeilans@live.com, richard-llvm@metafoo.co.uk
          </td>
        </tr></table>
      <p>
        <div>
        <pre>For certain CUDA codes, Clang 8 produces an executable that fails at run time with the error

<span class="quote">> cudaErrorInvalidDeviceFunction (error 8) due to "invalid device function" on CUDA API call to cudaLaunch.</span >

It's difficult to pinpoint exactly which conditions trigger this behavior, as it
seems to strike "at random". I've encountered this in the context of hipSYCL,
a SYCL implementation based on HIP (although the code is compiled as CUDA
in this case). The original issue, with a smallish demo, can be found here:
<a href="https://github.com/illuhad/hipSYCL/issues/49">https://github.com/illuhad/hipSYCL/issues/49</a>. Unfortunately, the demo
includes all of the added complexity surrounding hipSYCL.

This has so far been reproduced using Clang 8 and CUDA 9.2 on Arch, as well as
using Clang 8 and CUDA 10.0 on Ubuntu.

I'd like to provide a self-contained demo. However, compiling the test case with
`-save-temps` produces two files, one `*-cuda-nvptx64-nvidia-cuda-sm_52.cui`
and another `*-host-x86_64-pc-linux-gnu.cui`. These files are rather large, and
I'm not sure whether there is a way of feeding both of them back into Clang so
that their size can be reduced with delta. Any advice is welcome.

The good news: the issue appears to have been fixed in trunk, and I've narrowed
the fix down to <a href="https://reviews.llvm.org/D58163">https://reviews.llvm.org/D58163</a>. That being said, the assertion
at the top of `CGNVCUDARuntime::emitDeviceStub` still fails (i.e., the demo
only works when compiled with a release build, where assertions are disabled).
This is not surprising, as the commit was meant to address something seemingly
unrelated.

The assertion in question is

<span class="quote">> llvm-project/clang/lib/CodeGen/CGCUDANV.cpp:228: virtual void {anonymous}::CGNVCUDARuntime::emitDeviceStub(clang::CodeGen::CodeGenFunction&, clang::CodeGen::FunctionArgList&): Assertion `getDeviceSideName(CGF.CurFuncDecl) == CGF.CurFn->getName() || getDeviceSideName(CGF.CurFuncDecl) + ".stub" == CGF.CurFn->getName() || CGF.CGM.getContext().getTargetInfo().getCXXABI() != CGF.CGM.getContext().getAuxTargetInfo()->getCXXABI()' failed.</span >

and it fails for the broken example (see GitHub issue) because the two mangled
names being compared differ in the digit right after the `$_` (`$_0` vs. `$_1`):

<span class="quote">> _ZN2cl4sycl6detail8dispatch19parallel_for_kernelILi1EZZ4mainENK3$_0clERNS0_7handlerEEUlNS0_4itemILi1ELb1EEEE_EEvT0_NS0_5rangeIXT_EEE
> _ZN2cl4sycl6detail8dispatch19parallel_for_kernelILi1EZZ4mainENK3$_1clERNS0_7handlerEEUlNS0_4itemILi1ELb1EEEE_EEvT0_NS0_5rangeIXT_EEE</span >

whereas for the working example they are the same.
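
To make the difference easier to see, here is a minimal Python sketch that
locates the first position at which the two names quoted above differ. The
helper `first_difference` and the constants `NAME_A` / `NAME_B` are
hypothetical names introduced only for this illustration; they are not part of
Clang, hipSYCL, or the original demo.

```python
def first_difference(a: str, b: str) -> int:
    """Return the index of the first differing character, or -1 if equal."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i
    # No mismatch in the common prefix: either the strings are identical,
    # or one is a strict prefix of the other.
    return -1 if len(a) == len(b) else min(len(a), len(b))


# The two mangled names, copied from the assertion failure quoted above.
NAME_A = "_ZN2cl4sycl6detail8dispatch19parallel_for_kernelILi1EZZ4mainENK3$_0clERNS0_7handlerEEUlNS0_4itemILi1ELb1EEEE_EEvT0_NS0_5rangeIXT_EEE"
NAME_B = "_ZN2cl4sycl6detail8dispatch19parallel_for_kernelILi1EZZ4mainENK3$_1clERNS0_7handlerEEUlNS0_4itemILi1ELb1EEEE_EEvT0_NS0_5rangeIXT_EEE"

if __name__ == "__main__":
    idx = first_difference(NAME_A, NAME_B)
    # Show the three characters ending at the mismatch for each name.
    print(idx, NAME_A[idx - 2:idx + 1], NAME_B[idx - 2:idx + 1])
```

Apart from that single digit, the two names are identical.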

Please let me know if there are any additional steps I can take to provide more
context.</pre>
        </div>
      </p>


    </body>
</html>