[llvm] [amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic (PR #105822)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 27 11:56:43 PDT 2024


================
@@ -208,6 +208,20 @@ def int_amdgcn_init_exec_from_input : Intrinsic<[],
   [IntrConvergent, IntrHasSideEffects, IntrNoMem, IntrNoCallback,
    IntrNoFree, IntrWillReturn, ImmArg<ArgIndex<1>>]>;
 
+// Sets the function into whole-wave-mode and returns whether the lane was
+// active when entering the function. A branch depending on this return will
+// revert the EXEC mask to what it was when entering the function, thus
+// resulting in a no-op. This pattern is used to optimize branches when function
+// tails need to be run in whole-wave-mode. It may also have other consequences
+// (mostly related to WWM CSR handling) that differentiate it from using
+// a plain `amdgcn.init.exec -1`.
+//
+// Can only be used in functions with the `amdgpu_cs_chain` calling convention.
+// Using this intrinsic without immediately branching on its return value is an
+// error.
+def int_amdgcn_init_whole_wave : Intrinsic<[llvm_i1_ty], [], [
+    IntrHasSideEffects, IntrNoMem, IntrNoDuplicate, IntrConvergent]>;
----------------
arsenm wrote:

Does this really need IntrNoDuplicate, I would like to eliminate it

https://github.com/llvm/llvm-project/pull/105822


More information about the llvm-commits mailing list