[PATCH] D38090: [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins.
Justin Lebar via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 20 11:33:09 PDT 2017
jlebar added inline comments.
================
Comment at: clang/lib/Headers/__clang_cuda_intrinsics.h:161
+#endif // __CUDA_VERSION >= 9000 && (!defined(__CUDA_ARCH__) || __CUDA_ARCH__ >=
+ // 300)
+
----------------
Nit, better linebreaking in the comment?
================
Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:3744
+ Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty, llvm_i32_ty, llvm_i32_ty],
+ [IntrNoMem], "llvm.nvvm.shfl.sync.down.i32">,
+ GCCBuiltin<"__nvvm_shfl_sync_down_i32">;
----------------
Should this be IntrConvergent?
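
For illustration, a sketch of what this suggestion might look like, assuming the property is simply appended to the intrinsic's existing property list. The enclosing def is not visible in the quoted hunk, so it is omitted here. IntrConvergent is the existing intrinsic property defined in llvm/include/llvm/IR/Intrinsics.td. Whether IntrNoMem should remain unchanged is not addressed by this sketch.

```tablegen
  // Assumption: only the property list changes; everything else is copied
  // from the quoted hunk.
  Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty, llvm_i32_ty, llvm_i32_ty],
            [IntrNoMem, IntrConvergent], "llvm.nvvm.shfl.sync.down.i32">,
  GCCBuiltin<"__nvvm_shfl_sync_down_i32">;
```

Marking a call convergent tells optimization passes not to make it control-dependent on additional values, which matters for warp shuffles because they exchange data between threads of a warp.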
https://reviews.llvm.org/D38090