[llvm] [amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic (PR #105822)
Diana Picus via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 9 06:01:02 PDT 2024
rovka wrote:
> I don't think we can simply remove the `init.whole.wave` in the entry block.
> [...]
> The point is that when we have an instruction in the tail block that wants to operate on a function argument, I think the expectation is that the `V_ADD_I32` here should operate on (read/write) all the lanes. If we remove the EXEC setup in the entry block, the instruction would not be able to see the values in the lanes that were inactive at function start. Does this make sense?
Yes, that makes sense, and that's why I was introducing the V_SET_INACTIVE in the tail block.
Anyway, the new version follows Nicolai's suggestion to leave the branch optimization for a future patch. So for now the EXEC setup is still there, and we can discuss what to do with the inactive lanes in a future patch :)
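
For anyone following along, here is a rough toy model of the concern quoted above. It is only a sketch in plain Python, not actual AMDGPU or LLVM code: the helper `v_add`, the 4-lane wave, and the lane values are all made up for illustration. The only assumption it encodes is that a vector instruction touches just the lanes whose EXEC bit is set.

```python
# Toy model only: a "wave" is a list of per-lane values and EXEC is a list of
# per-lane booleans. All names here are hypothetical, not LLVM or hardware APIs.

def v_add(dst, src, imm, exec_mask):
    """Return a copy of dst where every active lane holds src[lane] + imm.

    Lanes whose EXEC bit is clear keep their old dst value, i.e. the
    instruction neither reads nor writes them.
    """
    return [s + imm if active else d
            for d, s, active in zip(dst, src, exec_mask)]

# A function argument with one value per lane (illustrative values).
arg = [10, 20, 30, 40]

# Lanes 1 and 3 were inactive when the function was entered.
entry_exec = [True, False, True, False]

# EXEC after the whole-wave setup in the entry block: all lanes enabled.
whole_wave_exec = [True] * len(arg)

# With the EXEC setup kept, the add in the tail block sees every lane.
with_setup = v_add(arg, arg, 1, whole_wave_exec)

# Without it, the lanes that were inactive at function start are skipped.
without_setup = v_add(arg, arg, 1, entry_exec)

print(with_setup)
print(without_setup)
```

In this model, lanes 1 and 3 are left untouched in the second call, which is the situation the quoted comment is worried about.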
https://github.com/llvm/llvm-project/pull/105822