[llvm] [AMDGPU] Allow hoisting of V_READFIRSTLANE_B32 for uniform operand (PR #178312)
Sameer Sahasrabuddhe via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 11 05:51:51 PST 2026
ssahasra wrote:
> I am a bit confused about the purpose of this hoisting. Can you please explain a bit what we are trying to achieve with the hoisting?
Yeah, I am back to that fundamental question too.
Whatever the goal is, the current implementation as shown in this PR is not okay. The convergent and noconvergent properties have a very clear definition based on "the set of threads that execute this operation". Until there is a clear demonstration of how "input to this readfirstlane is uniform" translates to "this readfirstlane is noconvergent", there's really no point discussing the current implementation.
We should be looking at other ways to do this, and my bet is on a peephole optimization that has specific knowledge of each intrinsic.
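To make the convergence argument above concrete, here is a minimal toy model, not LLVM or hardware code. It assumes a hypothetical 4-lane wave, a bitmask standing in for "the set of threads that execute this operation", and a readfirstlane that returns the value held by the lowest-numbered active lane. The names kLanes, LaneValues, and readFirstLane are invented for illustration, and an empty mask is treated as out of scope.

```cpp
#include <array>
#include <cstdint>

// Hypothetical wave size, kept small for illustration only.
constexpr int kLanes = 4;
using LaneValues = std::array<int, kLanes>;

// Toy model: return the value held by the lowest-numbered active lane.
// An empty mask is outside this model; 0 is returned as a placeholder.
constexpr int readFirstLane(const LaneValues &values, std::uint32_t activeMask) {
  for (int lane = 0; lane < kLanes; ++lane)
    if (activeMask & (1u << lane))
      return values[lane];
  return 0;
}

constexpr LaneValues divergent{10, 20, 30, 40};
constexpr LaneValues uniform{7, 7, 7, 7};

// Divergent input: the result changes with the set of executing lanes.
static_assert(readFirstLane(divergent, 0b1111) == 10, "all lanes active");
static_assert(readFirstLane(divergent, 0b1100) == 30, "lanes 2 and 3 active");

// Input identical in every lane: the returned value is the same for
// any non-empty mask in this model.
static_assert(readFirstLane(uniform, 0b1111) == 7, "all lanes active");
static_assert(readFirstLane(uniform, 0b1100) == 7, "lanes 2 and 3 active");
```

The sketch only models the returned value. Whether "same value under every non-empty mask" is enough to call the operation noconvergent is exactly the question that still needs a clear demonstration, and the model does not settle it.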
https://github.com/llvm/llvm-project/pull/178312