[llvm] [LLVM][NVPTX] Add NVPTX codegen support for clusterlaunchcontrol instruction (PR #134568)

Durgadoss R via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 7 00:46:33 PDT 2025


================
@@ -7301,3 +7301,66 @@ def INT_NVVM_ST_BULK_SHARED_CTA:
             "st.bulk.shared::cta [$dest_addr], $size, 0;",
             [(int_nvvm_st_bulk_shared_cta addr:$dest_addr, i64:$size, (i64 0))]>,
             Requires<[hasSM<100>, hasPTX<86>]>;
+
+// clusterlaunchcontrol Instructions
+
+multiclass CLUSTERLAUNCHCONTROL_TRY_CANCEL<Intrinsic Intrin, bit Multicast, list<Predicate> Pred> {
+  defvar Intrinsic = !cast<Intrinsic>(Intrin # !if(!eq(Multicast, 1), "_multicast", ""));
+  defvar IntrinsicShared = !cast<Intrinsic>(Intrin # !if(!eq(Multicast, 1), "_multicast", "") # "_shared");
+
+  def _gen: NVPTXInst<(outs), (ins Int64Regs:$addr, Int64Regs:$mbar),
+                      "clusterlaunchcontrol.try_cancel.async.mbarrier::complete_tx::bytes" #
+                      !if(!eq(Multicast, 1), ".multicast::cluster::all", "") #
+                      ".b128 [$addr], [$mbar];",
+                      [(Intrinsic Int64Regs:$addr, Int64Regs:$mbar)]>, Requires<Pred>;
+
+  def _shared: NVPTXInst<(outs), (ins Int64Regs:$addr, Int64Regs:$mbar),
+                         "clusterlaunchcontrol.try_cancel.async.shared::cta.mbarrier::complete_tx::bytes" #
----------------
durga4github wrote:

We could do the same thing for the prefix too, but the `_shared` variant has `shared::cta` as an extra modifier in the middle of it, so I will leave that to your choice.
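
To make the trade-off concrete, here is a minimal TableGen sketch of how both the prefix and the suffix could be hoisted into `defvar`s. It assumes "the same thing" refers to factoring the shared string pieces out of the two defs. The names `CLC_ASM_SKETCH`, `Prefix`, `Suffix`, and `Asm` are hypothetical and are not part of the PR. The sketch only builds the assembly strings, not the full `NVPTXInst` patterns.

```tablegen
// Hypothetical sketch, not the PR's code: it only shows the string composition.
multiclass CLC_ASM_SKETCH<bit Multicast> {
  // Part common to both variants, up to the point where they diverge.
  defvar Prefix = "clusterlaunchcontrol.try_cancel.async";
  // Part common to both variants after the optional state-space modifier.
  defvar Suffix = ".mbarrier::complete_tx::bytes" #
                  !if(!eq(Multicast, 1), ".multicast::cluster::all", "") #
                  ".b128 [$addr], [$mbar];";

  // Generic variant: no state-space modifier between prefix and suffix.
  def _gen    { string Asm = Prefix # Suffix; }
  // Shared variant: `.shared::cta` sits between prefix and suffix.
  def _shared { string Asm = Prefix # ".shared::cta" # Suffix; }
}

defm CLC_SKETCH    : CLC_ASM_SKETCH<0>;
defm CLC_SKETCH_MC : CLC_ASM_SKETCH<1>;
```

Because `.shared::cta` lands between the two common pieces, the hoisted prefix ends up fairly short, which is why factoring it out is optional.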

https://github.com/llvm/llvm-project/pull/134568

