[llvm] [LLVM][NVPTX]: Add intrinsic for setmaxnreg (PR #77289)

Durgadoss R via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 8 06:07:48 PST 2024


================
@@ -4710,4 +4710,10 @@ def int_nvvm_is_explicit_cluster
               [IntrNoMem, IntrSpeculatable, NoUndef<RetIndex>],
               "llvm.nvvm.is_explicit_cluster">;
 
+// Setmaxnreg intrinsic
+def int_nvvm_setmaxnreg_sync_aligned_u32
+  : DefaultAttrsIntrinsic<[], [llvm_i32_ty, llvm_i32_ty],
+              [IntrConvergent, IntrNoMem, IntrHasSideEffects, ImmArg<ArgIndex<1>>],
----------------
durga4github wrote:

The PTX instruction has '.aligned' semantics, i.e., all threads in the warpgroup must execute the same setmaxnreg instruction.

Quoting from the PTX spec:
--
"The mandatory .aligned qualifier indicates that all threads in the warpgroup must execute the same setmaxnreg instruction. In conditionally executed code, setmaxnreg instruction should only be used if it is known that all threads in warpgroup evaluate the condition identically, otherwise the behavior is undefined."
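
To make the "evaluate the condition identically" requirement concrete, here is a minimal host-side sketch in C++. It is not device code and is not part of this patch. It assumes a 1-D thread block and the PTX definition of a warpgroup as four contiguous warps (128 threads). The names isWarpgroupUniform, byWarpgroup and byWarp are hypothetical, introduced only for illustration.

```cpp
#include <cassert>

// Assumption: a warpgroup is 4 contiguous warps of 32 threads each.
constexpr unsigned kWarpSize = 32;
constexpr unsigned kWarpgroupSize = 4 * kWarpSize;

// Hypothetical helper (host-side model only): returns true if `cond`
// yields the same value for every thread of each warpgroup in a 1-D
// block of `blockSize` threads.
constexpr bool isWarpgroupUniform(unsigned blockSize,
                                  bool (*cond)(unsigned)) {
  for (unsigned base = 0; base < blockSize; base += kWarpgroupSize) {
    const bool first = cond(base);
    for (unsigned tid = base;
         tid < blockSize && tid < base + kWarpgroupSize; ++tid) {
      if (cond(tid) != first)
        return false; // threads of one warpgroup disagree
    }
  }
  return true;
}

// Branches on the warpgroup index: all 128 threads of a warpgroup agree,
// so guarding setmaxnreg with this condition satisfies '.aligned'.
constexpr bool byWarpgroup(unsigned tid) {
  return tid / kWarpgroupSize == 0;
}

// Branches on the warp index: warps inside one warpgroup disagree, so
// guarding setmaxnreg with this condition would be undefined behavior.
constexpr bool byWarp(unsigned tid) {
  return (tid / kWarpSize) % 2 == 0;
}

static_assert(isWarpgroupUniform(256, byWarpgroup),
              "warpgroup-indexed condition is uniform");
static_assert(!isWarpgroupUniform(256, byWarp),
              "warp-indexed condition is not uniform");
```

The two static_asserts show the distinction: a condition derived from the warpgroup index is safe to guard the instruction with, while one derived from the warp index or thread index is not.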

https://github.com/llvm/llvm-project/pull/77289