[PATCH] D80464: [CUDA] Missing __syncthreads intrinsic in __clang_cuda_device_functions.h

Boris Staletic via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sat May 23 01:01:35 PDT 2020


bstaletic marked an inline comment as done.
bstaletic added a comment.

This doesn't seem to actually compile:

  In file included from <built-in>:1:
  In file included from /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang/llvm-project/clang/test/Headers/../../lib/Headers/openmp_wrappers/__clang_openmp_device_functions.h:29:
  /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang/llvm-project/build/lib/clang/11.0.0/include/__clang_cuda_device_functions.h:522:16: error: static declaration of '__syncthreads' follows non-static declaration
  __DEVICE__ int __syncthreads(void) { return __nvvm_bar0(); }
                 ^
  /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang/llvm-project/build/lib/clang/11.0.0/include/__clang_cuda_device_functions.h:522:16: note: previous implicit declaration is here
  /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang/llvm-project/build/lib/clang/11.0.0/include/__clang_cuda_device_functions.h:522:45: error: use of undeclared identifier '__nvvm_bar0'
  __DEVICE__ int __syncthreads(void) { return __nvvm_bar0(); }
                                              ^

Looks like `__nvvm_bar0` is not declared or defined anywhere. Grepping for `__nvvm_bar0_and` finds the two entries below; `__nvvm_bar0` has no counterpart to either of them:

  llvm/include/llvm/IR/IntrinsicsNVVM.td
  1034:  def int_nvvm_barrier0_and : GCCBuiltin<"__nvvm_bar0_and">,
  
  clang/include/clang/Basic/BuiltinsNVPTX.def
  408:BUILTIN(__nvvm_bar0_and, "ii", "")

Should I add `BUILTIN(__nvvm_bar0, "v", "")` to `BuiltinsNVPTX.def`, along with whatever matching definition is needed in `IntrinsicsNVVM.td`?



================
Comment at: clang/lib/Headers/__clang_cuda_device_functions.h:522
 __DEVICE__ float __sinf(float __a) { return __nv_fast_sinf(__a); }
+__DEVICE__ int __syncthreads(void) { return __nvvm_bar0(); }
 __DEVICE__ int __syncthreads_and(int __a) { return __nvvm_bar0_and(__a); }
----------------
This doesn't seem to actually compile:

```
In file included from <built-in>:1:
In file included from /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang/llvm-project/clang/test/Headers/../../lib/Headers/openmp_wrappers/__clang_openmp_device_functions.h:29:
/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang/llvm-project/build/lib/clang/11.0.0/include/__clang_cuda_device_functions.h:522:16: error: static declaration of '__syncthreads' follows non-static declaration
__DEVICE__ int __syncthreads(void) { return __nvvm_bar0(); }
               ^
/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang/llvm-project/build/lib/clang/11.0.0/include/__clang_cuda_device_functions.h:522:16: note: previous implicit declaration is here
/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang/llvm-project/build/lib/clang/11.0.0/include/__clang_cuda_device_functions.h:522:45: error: use of undeclared identifier '__nvvm_bar0'
__DEVICE__ int __syncthreads(void) { return __nvvm_bar0(); }
                                            ^
```

Looks like `__nvvm_bar0` is not declared or defined anywhere. Grepping for `__nvvm_bar0_and(int)` finds the two entries below; `__nvvm_bar0` has no counterpart to either of them:

```
llvm/include/llvm/IR/IntrinsicsNVVM.td
1034:  def int_nvvm_barrier0_and : GCCBuiltin<"__nvvm_bar0_and">,
clang/include/clang/Basic/BuiltinsNVPTX.def
408:BUILTIN(__nvvm_bar0_and, "ii", "")
```

Should I add `BUILTIN(__nvvm_bar0, "v", "")` to `BuiltinsNVPTX.def`, along with whatever matching definition is needed in `IntrinsicsNVVM.td`?
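
For context, here is a minimal, self-contained sketch of the X-macro pattern that a `BUILTIN(name, type, attrs)` line in a `.def` file relies on. It is an illustration only, not Clang's actual code: `NVPTX_BUILTINS_SKETCH`, `builtin_info`, `lookup_builtin`, and `param_count` are hypothetical names, and the `__nvvm_bar0` entry is the proposed one, which does not exist in the tree. In a builtin type string the first character is the return type and the rest are parameters, so `"ii"` reads as `int(int)` and `"v"` as `void(void)`. If that reading is right, a `"v"` builtin returns nothing, which would not fit the `int __syncthreads(void) { return __nvvm_bar0(); }` line quoted above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the contents of a .def file. In Clang the
   entries live in BuiltinsNVPTX.def and are pulled in with #include;
   here they are inlined as a list macro so the sketch is self-contained.
   The second entry is the proposed one and is an assumption. */
#define NVPTX_BUILTINS_SKETCH(BUILTIN) \
  BUILTIN(__nvvm_bar0_and, "ii", "")   \
  BUILTIN(__nvvm_bar0, "v", "")

struct builtin_info {
  const char *name;  /* builtin identifier, stringified */
  const char *type;  /* type string: return type first, then parameters */
  const char *attrs; /* attribute string (empty for both entries) */
};

/* Expand every BUILTIN(...) entry into one row of a lookup table. */
#define AS_ROW(ID, TYPE, ATTRS) {#ID, TYPE, ATTRS},
static const struct builtin_info builtin_table[] = {
  NVPTX_BUILTINS_SKETCH(AS_ROW)
};
#undef AS_ROW

/* Return the table row for `name`, or NULL if no entry was declared. */
static const struct builtin_info *lookup_builtin(const char *name) {
  size_t n = sizeof builtin_table / sizeof builtin_table[0];
  for (size_t i = 0; i < n; ++i)
    if (strcmp(builtin_table[i].name, name) == 0)
      return &builtin_table[i];
  return NULL;
}

/* Count parameters in a type string. Simplifying assumption: every type
   is a single character, which holds for "ii" and "v" but not for the
   multi-character encodings that real type strings can contain. */
static size_t param_count(const struct builtin_info *b) {
  return strlen(b->type) - 1; /* drop the leading return-type character */
}
```

With both entries present, `lookup_builtin("__nvvm_bar0")` finds a row whose type string is `"v"`. A name with no `BUILTIN` line yields `NULL`, which is roughly the situation behind the "use of undeclared identifier" error.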


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80464/new/

https://reviews.llvm.org/D80464




