[llvm] [AMDGPU] Add llvm.amdgcn.init.whole.wave intrinsic (PR #105822)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 27 11:56:43 PDT 2024
================
@@ -208,6 +208,20 @@ def int_amdgcn_init_exec_from_input : Intrinsic<[],
[IntrConvergent, IntrHasSideEffects, IntrNoMem, IntrNoCallback,
IntrNoFree, IntrWillReturn, ImmArg<ArgIndex<1>>]>;
+// Sets the function into whole-wave-mode and returns whether the lane was
+// active when entering the function. A branch depending on this return will
+// revert the EXEC mask to what it was when entering the function, thus
+// resulting in a no-op. This pattern is used to optimize branches when function
+// tails need to be run in whole-wave-mode. It may also have other consequences
+// (mostly related to WWM CSR handling) that differentiate it from using
+// a plain `amdgcn.init.exec -1`.
+//
+// Can only be used in functions with the `amdgpu_cs_chain` calling convention.
+// Using this intrinsic without immediately branching on its return value is an
+// error.
+def int_amdgcn_init_whole_wave : Intrinsic<[llvm_i1_ty], [], [
+ IntrHasSideEffects, IntrNoMem, IntrNoDuplicate, IntrConvergent]>;
----------------
arsenm wrote:
Does this really need IntrNoDuplicate? I would like to eliminate it.
https://github.com/llvm/llvm-project/pull/105822