[llvm] [NVPTX] Add 2-CTA mode support to TMA G2S intrinsics (PR #143178)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 6 11:44:13 PDT 2025


================
@@ -1034,18 +1034,22 @@ source tensor is preserved at the destination. The dimension of the
 tensor data ranges from 1d to 5d with the coordinates specified
 by the ``i32 %d0 ... i32 %d4`` arguments.
 
-* The last two arguments to these intrinsics are boolean flags
-  indicating support for cache_hint and/or multicast modifiers.
-  These flag arguments must be compile-time constants. The backend
-  looks through these flags and lowers the intrinsics appropriately.
+* The last three arguments to these intrinsics are boolean flags
+  indicating support for multicast, cache_hint and cta_group::2
+  modifiers. These flag arguments must be compile-time constants.
+  The backend looks through these flags and lowers the intrinsics
+  appropriately.
----------------
Artem-B wrote:

I think we're growing too many bool arguments.
It's fine when we have one. At least then we know what that bool is for.
With two it gets tricky -- which of the bools controls what? 
With three it starts to get into "I have to look at the docs every time I look at the call site" territory.

I think we should consider shifting the balance towards encoding such flags in the intrinsic names when they reflect specific functionality. E.g., multicast would be a sensible part of the intrinsic name.

Another option for the `cta_group::*` parameter is to make it mandatory, accepting values 0/1/2/3: 0 means no cta_group, 1 and 2 select the corresponding variants, and 3 is reserved (either defaulting to no cta_group or triggering an error).
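
A minimal sketch of what that mandatory-operand encoding could look like on the lowering side. All names here (`CTAGroupKind`, `decodeCTAGroup`, `ctaGroupSuffix`) are hypothetical and not taken from the NVPTX backend; the sketch also assumes the "reserved value triggers an error" variant rather than silently defaulting.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical enum mirroring the proposed 0/1/2 operand values.
enum class CTAGroupKind : uint8_t {
  None = 0, // no cta_group modifier
  CG1 = 1,  // cta_group::1
  CG2 = 2,  // cta_group::2
};

// Decode the immediate operand. The reserved value 3 (and anything
// larger) yields std::nullopt so the caller can emit a diagnostic.
inline std::optional<CTAGroupKind> decodeCTAGroup(uint64_t Imm) {
  switch (Imm) {
  case 0:
    return CTAGroupKind::None;
  case 1:
    return CTAGroupKind::CG1;
  case 2:
    return CTAGroupKind::CG2;
  default:
    return std::nullopt;
  }
}

// Map the decoded kind to the modifier text appended to the instruction.
inline std::string_view ctaGroupSuffix(CTAGroupKind K) {
  switch (K) {
  case CTAGroupKind::None:
    return "";
  case CTAGroupKind::CG1:
    return ".cta_group::1";
  case CTAGroupKind::CG2:
    return ".cta_group::2";
  }
  return "";
}
```

With a single integer operand, the call site reads as "cta_group = 2" rather than as one more unlabeled `i1`, and a new variant would not need another flag argument.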


https://github.com/llvm/llvm-project/pull/143178