[llvm] [LLVM][NVPTX] Add NVPTX codegen support for clusterlaunchcontrol instruction (PR #134568)

Alex MacLean via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 7 09:19:58 PDT 2025


================
@@ -7301,3 +7301,67 @@ def INT_NVVM_ST_BULK_SHARED_CTA:
             "st.bulk.shared::cta [$dest_addr], $size, 0;",
             [(int_nvvm_st_bulk_shared_cta addr:$dest_addr, i64:$size, (i64 0))]>,
             Requires<[hasSM<100>, hasPTX<86>]>;
+
+// clusterlaunchcontrol Instructions
+
+multiclass CLUSTERLAUNCHCONTROL_TRY_CANCEL<Intrinsic Intrin, bit Multicast, list<Predicate> Pred> {
+  defvar Intrinsic = !cast<Intrinsic>(Intrin # !if(!eq(Multicast, 1), "_multicast", ""));
+  defvar IntrinsicShared = !cast<Intrinsic>(Intrin # !if(!eq(Multicast, 1), "_multicast", "") # "_shared");
+  defvar MulticastMod = ".multicast::cluster::all";
+
+  def _gen: NVPTXInst<(outs), (ins Int64Regs:$addr, Int64Regs:$mbar),
+                      "clusterlaunchcontrol.try_cancel.async.mbarrier::complete_tx::bytes" #
+                      !if(!eq(Multicast, 1), MulticastMod, "") #
+                      ".b128 [$addr], [$mbar];",
+                      [(Intrinsic Int64Regs:$addr, Int64Regs:$mbar)]>, Requires<Pred>;
+
+  def _shared: NVPTXInst<(outs), (ins Int64Regs:$addr, Int64Regs:$mbar),
+                         "clusterlaunchcontrol.try_cancel.async.shared::cta.mbarrier::complete_tx::bytes" #
+                         !if(!eq(Multicast, 1), MulticastMod, "") #
+                         ".b128 [$addr], [$mbar];",
+                         [(IntrinsicShared Int64Regs:$addr, Int64Regs:$mbar)]>, Requires<Pred>;
+
+  def _shared32: NVPTXInst<(outs), (ins Int32Regs:$addr, Int32Regs:$mbar),
+                           "clusterlaunchcontrol.try_cancel.async.shared::cta.mbarrier::complete_tx::bytes" #
+                           !if(!eq(Multicast, 1), MulticastMod, "") #
+                           ".b128 [$addr], [$mbar];",
----------------
AlexMaclean wrote:

You can use the `addr` pattern and the `ADDR` operand to get address folding automatically, which avoids the need for separate 32-bit and 64-bit variants.
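
A rough sketch of what this suggestion might look like, assuming `ADDR` and `addr` behave here as they do for the `st.bulk.shared::cta` pattern quoted at the top of the hunk. Keeping the `_shared` name and collapsing the two variants into one `def` are assumptions for illustration, not the PR's final code. The fragment belongs inside the existing multiclass, so `Multicast`, `MulticastMod`, `IntrinsicShared`, and `Pred` are the names already defined there:

```tablegen
// Hypothetical single definition replacing the _shared / _shared32 pair.
// ADDR operands and addr patterns stand in for the explicit
// Int64Regs / Int32Regs operands used in the original diff.
def _shared: NVPTXInst<(outs), (ins ADDR:$addr, ADDR:$mbar),
                       "clusterlaunchcontrol.try_cancel.async.shared::cta.mbarrier::complete_tx::bytes" #
                       !if(!eq(Multicast, 1), MulticastMod, "") #
                       ".b128 [$addr], [$mbar];",
                       [(IntrinsicShared addr:$addr, addr:$mbar)]>, Requires<Pred>;
```

With this shape, one definition would be expected to match both pointer widths, and constant offsets could be folded into the address operand during instruction selection.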

https://github.com/llvm/llvm-project/pull/134568

