[llvm] [LLVM][NVPTX] Add NVPTX codegen support for clusterlaunchcontrol instruction (PR #134568)
    via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Mon Apr  7 12:16:15 PDT 2025
    
    
  
================
@@ -5381,4 +5381,50 @@ def int_nvvm_st_bulk_shared_cta : DefaultAttrsIntrinsic<[],
   [IntrArgMemOnly, IntrWriteMem,
     WriteOnly<ArgIndex<0>>, NoCapture<ArgIndex<0>>, ImmArg<ArgIndex<2>>]>;
 
+//
+// Cluster launch control
+//
+
+// clusterlaunchcontrol.try_cancel
+
+def int_nvvm_clusterlaunchcontrol_try_cancel_async
+    : Intrinsic<[], [llvm_ptr_ty, llvm_ptr_ty],
+                [IntrHasSideEffects, IntrArgMemOnly, NoCapture<ArgIndex<0>>, NoCapture<ArgIndex<1>>],
+                "llvm.nvvm.clusterlaunchcontrol.try_cancel.async">;
----------------
gonzalobg wrote:
In general, for every intrinsic that only accepts "addresses to one state space, or generic addresses that point into that one state space", LLVM should probably be doing what you propose (@akshayrdeodhar for vis, since this sounds worth exploring as a way to get better optimizations for intrinsics).
Some intrinsics that accept generic addresses offload the address space detection + conversion to HW.
Performing that detection + conversion in SW instead is pretty much always worse than letting HW do it.
For these, it would be good if we could express that a generic address, e.g.,
- can only narrow to some state spaces (e.g. either "global" or "shared").
- cannot narrow to some state spaces (e.g. "local").
That way, if address space narrowing proves that a generic address does not point to, e.g., "shared", and that address is then passed to an intrinsic that only supports shared | global, the intrinsic lets address space narrowing narrow the address further to global.
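To make the narrowing argument concrete, here is a minimal sketch that models "the state spaces a generic address may point into" as a bitmask. It is not LLVM code: `StateSpace`, `SpaceSet`, `without`, `narrow`, `isSingleSpace`, and `soleSpace` are hypothetical names chosen for illustration, and the set of state spaces is deliberately reduced to four.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model (not an LLVM API): each bit is one state space that a
// generic address might point into.
enum StateSpace : std::uint8_t {
  Global = 1u << 0,
  Shared = 1u << 1,
  Local  = 1u << 2,
  Const  = 1u << 3,
};
using SpaceSet = std::uint8_t;

// With no information, a generic address may point into any modeled space.
constexpr SpaceSet AllSpaces = Global | Shared | Local | Const;

// Record a proof that the address does NOT point into `removed`.
constexpr SpaceSet without(SpaceSet s, SpaceSet removed) {
  return static_cast<SpaceSet>(s & ~removed);
}

// Intersect what the analysis already knows with what an intrinsic accepts.
constexpr SpaceSet narrow(SpaceSet known, SpaceSet accepted) {
  return static_cast<SpaceSet>(known & accepted);
}

// True when exactly one state space remains, so the generic address could be
// rewritten to a specific address space.
constexpr bool isSingleSpace(SpaceSet s) {
  return s != 0 && (s & (s - 1)) == 0;
}

// Return the single remaining state space; the caller must have checked
// isSingleSpace first.
inline StateSpace soleSpace(SpaceSet s) {
  assert(isSingleSpace(s) && "address is not narrowed to one state space");
  return static_cast<StateSpace>(s);
}
```

With these definitions, `narrow(without(AllSpaces, Shared), Global | Shared)` leaves only `Global`: the "not shared" fact alone is not enough, and the "shared | global only" constraint alone is not enough, but together they pin the address down to one state space.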
https://github.com/llvm/llvm-project/pull/134568
    
    