[llvm] [NVPTX] Add 2-CTA mode support to TMA G2S intrinsics (PR #143178)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 6 11:44:13 PDT 2025
================
@@ -1034,18 +1034,22 @@ source tensor is preserved at the destination. The dimension of the
tensor data ranges from 1d to 5d with the coordinates specified
by the ``i32 %d0 ... i32 %d4`` arguments.
-* The last two arguments to these intrinsics are boolean flags
- indicating support for cache_hint and/or multicast modifiers.
- These flag arguments must be compile-time constants. The backend
- looks through these flags and lowers the intrinsics appropriately.
+* The last three arguments to these intrinsics are boolean flags
+ indicating support for multicast, cache_hint and cta_group::2
+ modifiers. These flag arguments must be compile-time constants.
+ The backend looks through these flags and lowers the intrinsics
+ appropriately.
----------------
Artem-B wrote:
I think we're growing too many bool arguments.
It's fine when we have one. At least then we know what that bool is for.
With two it gets tricky -- which of the bools controls what?
With three it starts to get into "I have to look at the docs every time I look at the call site" territory.
I think we should consider shifting the balance towards encoding such flags in the intrinsic names, when they reflect specific functionality. E.g., multicast would be a sensible part of the intrinsic name.
Another option for the `cta_group::*` parameter is to make it mandatory, accepting 0/1/2/3 values: 0 means no cta_group, 1 and 2 select the given variants, and 3 is reserved, either defaulting to no cta_group or triggering an error.
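
A minimal C++ sketch of that second option, to show how a single mandatory operand might be decoded. The names `CtaGroup`, `decodeCtaGroup` and `ctaGroupSuffix` are hypothetical and are not the actual NVPTX backend API. The sketch also assumes the "reserved value triggers an error" variant rather than silently defaulting to no cta_group:

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical encoding of the mandatory cta_group operand (illustration only).
enum class CtaGroup : std::uint8_t {
  None = 0, // no cta_group modifier
  One = 1,  // cta_group::1
  Two = 2,  // cta_group::2
  // 3 is reserved and intentionally has no enumerator.
};

// Decode the raw immediate. Returns std::nullopt for the reserved value 3
// (and anything larger), so the caller can report an error.
constexpr std::optional<CtaGroup> decodeCtaGroup(std::uint64_t Raw) {
  switch (Raw) {
  case 0:
    return CtaGroup::None;
  case 1:
    return CtaGroup::One;
  case 2:
    return CtaGroup::Two;
  default:
    return std::nullopt;
  }
}

// Map the decoded value to the modifier text appended to the instruction.
constexpr std::string_view ctaGroupSuffix(CtaGroup G) {
  switch (G) {
  case CtaGroup::None:
    return "";
  case CtaGroup::One:
    return ".cta_group::1";
  case CtaGroup::Two:
    return ".cta_group::2";
  }
  return "";
}
```

With something along these lines, a call site carries one self-describing value (0, 1 or 2) instead of a third positional `i1`, and the reserved encoding is rejected in one place.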
https://github.com/llvm/llvm-project/pull/143178