[PATCH] D66665: [CUDA] Use activemask.b32 instruction to implement __activemask w/ CUDA-9.2+

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Sep 3 10:34:43 PDT 2019


This revision was automatically updated to reflect the committed changes.
Closed by commit rL370792: [CUDA] Use activemask.b32 instruction to implement __activemask w/ CUDA-9.2+ (authored by tra, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D66665?vs=216886&id=218486#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66665/new/

https://reviews.llvm.org/D66665

Files:
  cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h


Index: cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h
===================================================================
--- cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h
+++ cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h
@@ -211,7 +211,15 @@
   return __nvvm_vote_ballot_sync(mask, pred);
 }
 
-inline __device__ unsigned int __activemask() { return __nvvm_vote_ballot(1); }
+inline __device__ unsigned int __activemask() {
+#if CUDA_VERSION < 9020
+  return __nvvm_vote_ballot(1);
+#else
+  unsigned int mask;
+  asm volatile("activemask.b32 %0;" : "=r"(mask));
+  return mask;
+#endif
+}
 
 inline __device__ unsigned int __fns(unsigned mask, unsigned base, int offset) {
   return __nvvm_fns(mask, base, offset);


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D66665.218486.patch
Type: text/x-patch
Size: 722 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20190903/a165cbfe/attachment.bin>


More information about the cfe-commits mailing list