<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/125473">125473</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [clang, CUDA] Binary compiled with clang produces wrong output
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            clang
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          LukasBreitwieser
      </td>
    </tr>
</table>

<pre>
    Hi All,

When I compile a simple CUDA kernel (see `main.cu`) with clang and run the resulting binary, I get a wrong result. The regression starts with commit: [953beb9fe969bf8ab1857924ea0d3dd6ea506ab1](https://github.com/llvm/llvm-project/commit/953beb9fe969bf8ab1857924ea0d3dd6ea506ab1)
Expected: exit code 0 with output: `Test passed`.
Actual: exit code 1 with output:
```
Expected 42, but actual is: 0
Test FAILED
```
 
File: `main.cu`
```
#include <cstdlib>
#include <iostream>

// Define CUDA Kernel
__global__ void Kernel(int *out) { *out = 42; }

int main() {
  // Execute CUDA Kernel
  int *deviceOutput = nullptr;
  int hostOutput = 0;
  cudaMalloc((void **)&deviceOutput, sizeof(int));
  Kernel<<<1, 1>>>(deviceOutput);
  cudaGetLastError();
  cudaDeviceSynchronize();
  cudaMemcpy(&hostOutput, deviceOutput, sizeof(int), cudaMemcpyDeviceToHost);

  // Verify result
  if (hostOutput != 42) {
    std::cout << "Expected 42, but actual is: " << hostOutput << std::endl;
    std::cout << "Test FAILED" << std::endl;
    std::exit(1);
  }
  std::cout << "Test passed" << std::endl;
  return 0;
}
```
Compilation of `main.cu`
```
  clang++ \
    -x cuda \
    -lcuda \
    -lcudart \
    -o main \
    main.cu
```
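
In case it helps triage, here is a sketch of the same reproducer with every CUDA runtime return code checked. It might show whether the kernel launch itself reports an error or whether the kernel runs without writing the result. `CUDA_CHECK` is a hypothetical helper macro (not part of the CUDA runtime), and exit code 2 is an arbitrary choice to tell an API error apart from a wrong result. It should build with the same command line as above, with the file name substituted.
```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the CUDA runtime): print the failing call
// and its error string, then exit with code 2, if a call does not succeed.
#define CUDA_CHECK(call)                                           \
  do {                                                             \
    cudaError_t err_ = (call);                                     \
    if (err_ != cudaSuccess) {                                     \
      std::fprintf(stderr, "%s failed: %s\n", #call,               \
                   cudaGetErrorString(err_));                      \
      std::exit(2);                                                \
    }                                                              \
  } while (0)

// Same kernel as in main.cu
__global__ void Kernel(int *out) { *out = 42; }

int main() {
  int *deviceOutput = nullptr;
  int hostOutput = 0;
  CUDA_CHECK(cudaMalloc((void **)&deviceOutput, sizeof(int)));
  Kernel<<<1, 1>>>(deviceOutput);
  CUDA_CHECK(cudaGetLastError());       // reports errors from the launch
  CUDA_CHECK(cudaDeviceSynchronize());  // reports errors from execution
  CUDA_CHECK(cudaMemcpy(&hostOutput, deviceOutput, sizeof(int),
                        cudaMemcpyDeviceToHost));
  CUDA_CHECK(cudaFree(deviceOutput));

  // Exit code 0 only if the kernel actually wrote the expected value.
  std::printf("hostOutput = %d\n", hostOutput);
  return hostOutput == 42 ? 0 : 1;
}
```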

LLVM configuration
```bash
  cmake -G Ninja \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
    -DLLVM_ENABLE_PROJECTS="clang" \
    -DLLVM_TARGETS_TO_BUILD="host;NVPTX" \
    <path-to-llvm-project-dir>/llvm
```

System Info
- OS: `Ubuntu 20.04.1 LTS`
- Compiler used to compile llvm: `g++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0`
- CUDA: `12.6`

I am happy to provide more information if needed.
Thank you in advance for taking a look.

Best,
Lukas
</pre>