[PATCH] D100394: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions
Artem Belevich via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 20 10:02:02 PDT 2021
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
LGTM overall, modulo a few test and naming nits.
================
Comment at: llvm/lib/Target/NVPTX/NVPTXIntrinsics.td:415
+ (ins Int32Regs:$addr, Int32Regs:$count),
+ !strconcat("mbarrier.arrive.noComplete", AddrSpace,
+ ".b64 $state, [$addr], $count;"),
----------------
Does ptxas accept all-lower-case `nocomplete`?
The `no*C*omplete` sticks out like a sore thumb. Capital letters are used in a few LLVM intrinsics, so it's not a showstopper, but I think lower case everywhere makes more sense. WDYT?
================
Comment at: llvm/test/CodeGen/NVPTX/async-copy.ll:40
+
+define void @asyncmbarrier(i64* %a) {
+; CHECK_PTX32: cp.async.mbarrier.arrive.b64 [%r{{[0-9]+}}];
----------------
All functions in the file should use `-LABEL` checks.
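As a rough illustration of what this could look like for the function quoted above: the fragment is only a sketch. It reuses the existing `CHECK_PTX32` prefix with FileCheck's `-LABEL` suffix, and the `asyncmbarrier(` pattern assumes the function name followed by `(` appears in the emitted PTX. Any other prefixes enabled by the RUN lines would need their own label line or a shared prefix.

```llvm
; Placed right after the `define void @asyncmbarrier(...)` line.
; The label pattern is an assumption; adjust it to the actual PTX output.
; CHECK_PTX32-LABEL: asyncmbarrier(
; CHECK_PTX32: cp.async.mbarrier.arrive.b64 [%r{{[0-9]+}}];
```

The label line splits the output into per-function blocks, so the instruction check can only match inside the block for this function and not against a later function's output.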
================
Comment at: llvm/test/CodeGen/NVPTX/mbarrier.ll:7
+
+define void @barrierinit(i64* %a, i32 %b) {
+; CHECK_PTX32: mbarrier.init.b64 [%r{{[0-9]+}}], %r{{[0-9]+}};
----------------
Same here. Please add `-LABEL` checks for all functions.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D100394/new/
https://reviews.llvm.org/D100394