[llvm] [amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic (PR #105822)

Mon Sep 2 02:09:35 PDT 2024

================
@@ -208,6 +208,20 @@ def int_amdgcn_init_exec_from_input : Intrinsic<[],
   [IntrConvergent, IntrHasSideEffects, IntrNoMem, IntrNoCallback,
    IntrNoFree, IntrWillReturn, ImmArg<ArgIndex<1>>]>;
 
+// Sets the function into whole-wave-mode and returns whether the lane was
+// active when entering the function. A branch depending on this return will
----------------
nhaehnle wrote:

It would be pretty cool to support something like this for arbitrary functions. However, the semantics of it would have to be different.

For chain functions, we explicitly plan to rely on the fact that function arguments in the inactive lanes have meaningful data in them, and the function that uses this intrinsic is expected to be aware of this and in fact to make use of this data. In other words, there's no need for the backend to save/restore the contents of inactive lanes. That would not be true for default calling convention functions.

I have long thought that we should rip out the current whole wave mode implementation in favor of something that works more like an MLIR region (represented as a function to be called). However, something needs to replace the functionality of set.inactive. So whatever we end up with in the end, it could well be somewhat similar, but it's different enough that we should keep it separate from this PR.

https://github.com/llvm/llvm-project/pull/105822