[PATCH] D100394: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 20 10:02:02 PDT 2021


tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.

LGTM overall, modulo a few test and naming nits.



================
Comment at: llvm/lib/Target/NVPTX/NVPTXIntrinsics.td:415
+           (ins Int32Regs:$addr, Int32Regs:$count),
+           !strconcat("mbarrier.arrive.noComplete", AddrSpace,
+                      ".b64 $state, [$addr], $count;"),
----------------
Does ptxas accept all-lower-case `nocomplete`?

The `no*C*omplete` sticks out like a sore thumb. Capital letters are used in a few LLVM intrinsics, so it's not a showstopper, but I think lower case everywhere makes more sense. WDYT?



================
Comment at: llvm/test/CodeGen/NVPTX/async-copy.ll:40
+
+define void @asyncmbarrier(i64* %a) {
+; CHECK_PTX32: cp.async.mbarrier.arrive.b64 [%r{{[0-9]+}}];
----------------
All functions in the file should use `-LABEL` checks.
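
For illustration, a minimal sketch of what that could look like for the function quoted above. The RUN line, the intrinsic declaration, and the call in the body are assumptions reconstructed for the example rather than copied from the patch; only the function signature and the existing `CHECK_PTX32` line come from the diff. It uses the typed-pointer syntax (`i64*`) that the patch itself uses.

```llvm
; Assumed RUN line: cp.async needs sm_80 and PTX 7.0.
; RUN: llc < %s -march=nvptx -mcpu=sm_80 -mattr=+ptx70 | FileCheck %s -check-prefixes=CHECK_PTX32

; Assumed intrinsic declaration, matching the instruction being checked.
declare void @llvm.nvvm.cp.async.mbarrier.arrive(i64* %a)

; The -LABEL directive anchors the checks below to this function's output,
; so they cannot accidentally match an instruction emitted for a neighbor.
; CHECK_PTX32-LABEL: asyncmbarrier(
define void @asyncmbarrier(i64* %a) {
; CHECK_PTX32: cp.async.mbarrier.arrive.b64 [%r{{[0-9]+}}];
  tail call void @llvm.nvvm.cp.async.mbarrier.arrive(i64* %a)
  ret void
}
```

With a `-LABEL` line per function, FileCheck splits the output into per-function blocks, which also makes a failure point at the function that actually regressed.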


================
Comment at: llvm/test/CodeGen/NVPTX/mbarrier.ll:7
+
+define void @barrierinit(i64* %a, i32 %b) {
+; CHECK_PTX32: mbarrier.init.b64 [%r{{[0-9]+}}], %r{{[0-9]+}};
----------------
Same here. Please add `-LABEL` checks for all functions.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100394/new/

https://reviews.llvm.org/D100394


