<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Missing GPU Builtins when compiling CUDA to bitcode on Windows"
   href="https://bugs.llvm.org/show_bug.cgi?id=52224">52224</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Missing GPU Builtins when compiling CUDA to bitcode on Windows
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>12.0
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Windows NT
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>release blocker
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>CUDA
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>rmohamme@mathworks.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>When compiling CUDA device code to bitcode with clang++ on Windows, the
implementations of several GPU builtins appear to be missing for float inputs.
The functions I have found so far are:

round, erf, erfc, hypot, expm1, log2, log1p, asinh, acosh and atanh.

This list is not exhaustive; other functions may be affected as well. The
reproduction code and compilation command follow.

File: repro.cu
Code:
extern "C" __device__ float reprof(float in) {
    return ::round(in);
}

Compilation Command:
clang++ --cuda-gpu-arch=sm_50 --cuda-path=&lt;PATH_TO_CUDA&gt; --cuda-device-only -O2
-S -emit-llvm repro.cu -o repro.ll</pre>
        </div>
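        <div>
        <pre>A broader reproduction can be sketched along the same lines, with one wrapper
per function named in the list above. The file name repro_all.cu and the
repro_* wrapper names are hypothetical, chosen only for illustration. The
sketch assumes each function is called with float arguments in the same way as
round in repro.cu, and that the file is compiled with the same command, with
repro.cu and repro.ll replaced by the new file names.

File: repro_all.cu (hypothetical)
Code:
```cuda
// Hypothetical extension of repro.cu: one device wrapper per function
// named in the report. Every argument is a float, so each call should
// resolve to the float overload of the corresponding math function.
extern "C" __device__ float repro_round(float in) { return ::round(in); }
extern "C" __device__ float repro_erf(float in)   { return ::erf(in); }
extern "C" __device__ float repro_erfc(float in)  { return ::erfc(in); }
extern "C" __device__ float repro_expm1(float in) { return ::expm1(in); }
extern "C" __device__ float repro_log2(float in)  { return ::log2(in); }
extern "C" __device__ float repro_log1p(float in) { return ::log1p(in); }
extern "C" __device__ float repro_asinh(float in) { return ::asinh(in); }
extern "C" __device__ float repro_acosh(float in) { return ::acosh(in); }
extern "C" __device__ float repro_atanh(float in) { return ::atanh(in); }

// hypot is the only two-argument function in the list.
extern "C" __device__ float repro_hypot(float a, float b) {
    return ::hypot(a, b);
}
```

Keeping each call in its own extern "C" wrapper should make it easier to see
in the emitted .ll file which functions are affected.</pre>
        </div>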
      </p>


    </body>
</html>