[llvm] [LLVM][NVPTX] Add NVPTX codegen support for clusterlaunchcontrol instruction (PR #134568)
Durgadoss R via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 7 00:46:33 PDT 2025
================
@@ -7301,3 +7301,66 @@ def INT_NVVM_ST_BULK_SHARED_CTA:
"st.bulk.shared::cta [$dest_addr], $size, 0;",
[(int_nvvm_st_bulk_shared_cta addr:$dest_addr, i64:$size, (i64 0))]>,
Requires<[hasSM<100>, hasPTX<86>]>;
+
+// clusterlaunchcontrol Instructions
+
+multiclass CLUSTERLAUNCHCONTROL_TRY_CANCEL<Intrinsic Intrin, bit Multicast, list<Predicate> Pred> {
+ defvar Intrinsic = !cast<Intrinsic>(Intrin # !if(!eq(Multicast, 1), "_multicast", ""));
+ defvar IntrinsicShared = !cast<Intrinsic>(Intrin # !if(!eq(Multicast, 1), "_multicast", "") # "_shared");
+
+ def _gen: NVPTXInst<(outs), (ins Int64Regs:$addr, Int64Regs:$mbar),
+ "clusterlaunchcontrol.try_cancel.async.mbarrier::complete_tx::bytes" #
+ !if(!eq(Multicast, 1), ".multicast::cluster::all", "") #
+ ".b128 [$addr], [$mbar];",
+ [(Intrinsic Int64Regs:$addr, Int64Regs:$mbar)]>, Requires<Pred>;
+
+ def _shared: NVPTXInst<(outs), (ins Int64Regs:$addr, Int64Regs:$mbar),
+ "clusterlaunchcontrol.try_cancel.async.shared::cta.mbarrier::complete_tx::bytes" #
----------------
durga4github wrote:
We could do the same for the prefix too, but the `_shared` variant carries `shared::cta` as an extra modifier in the middle of it, so I'll leave that to your choice.
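
One possible reading of this, as a sketch only: the earlier suggestion being referred to is not shown in this message, so the example assumes it was about hoisting the repeated part of the asm string into a `defvar`. The `AsmSketch` class and the `TRY_CANCEL_SKETCH` / `SKETCH*` names are hypothetical stand-ins, not part of the patch; they only hold the assembled string so that the concatenation can be inspected with `llvm-tblgen`.

```tablegen
// Hypothetical stand-in for NVPTXInst: it only records the asm string.
class AsmSketch<string s> {
  string AsmString = s;
}

multiclass TRY_CANCEL_SKETCH<bit Multicast> {
  // Part shared by both variants, up to where `.shared::cta` may appear.
  defvar Prefix = "clusterlaunchcontrol.try_cancel.async";
  // Part shared by both variants after the optional state-space modifier.
  defvar Suffix = ".mbarrier::complete_tx::bytes" #
                  !if(!eq(Multicast, 1), ".multicast::cluster::all", "") #
                  ".b128 [$addr], [$mbar];";

  // Generic form: no state-space modifier between Prefix and Suffix.
  def _gen    : AsmSketch<Prefix # Suffix>;
  // Shared form: `.shared::cta` is spliced in between the two pieces.
  def _shared : AsmSketch<Prefix # ".shared::cta" # Suffix>;
}

defm SKETCH    : TRY_CANCEL_SKETCH<0>;
defm SKETCH_MC : TRY_CANCEL_SKETCH<1>;
```

Because `shared::cta` sits between the two common pieces, factoring the prefix means splitting the string in two places rather than one, which is the trade-off the comment points at.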
https://github.com/llvm/llvm-project/pull/134568