[llvm] [AMDGPU] Fix Inefficient S_CSELECT_B64 Sequence (PR #167780)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 14 03:08:16 PST 2025
https://github.com/jayfoad requested changes to this pull request.
> s_cselect_b64 s[0:1], X!=0, 0
> v_cndmask_b32_e64 v0, Z, Y, s[0:1]
> v_readfirstlane_b32 s0, v0
>
> To:
>
> s_cselect_b32 s0, Y, Z
That's not safe unless you know that the bit in X corresponding to the first active lane is 1. It would be much easier to do this only if X == -1, since then every lane's bit is set.
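
To illustrate the concern, here is a minimal Python sketch that models the two sequences for a wave64 target. It is not code from the PR. The function names are hypothetical, Y and Z are assumed to be uniform (scalar) values, and the semantics modeled are: `s_cselect` picks its first source when SCC is set, `v_cndmask_b32_e64` picks its second source in lanes whose mask bit is 1, and `v_readfirstlane_b32` reads the lowest-numbered active lane (lane 0 if EXEC is zero).

```python
WAVE_SIZE = 64                      # assumption: wave64
MASK64 = (1 << WAVE_SIZE) - 1


def first_active_lane(exec_mask):
    """Index of the lowest set bit in EXEC, or 0 if no lane is active."""
    if exec_mask == 0:
        return 0
    return (exec_mask & -exec_mask).bit_length() - 1


def original_sequence(scc, x, y, z, exec_mask):
    """Model of s_cselect_b64 + v_cndmask_b32_e64 + v_readfirstlane_b32."""
    # s_cselect_b64 s[0:1], X, 0  ->  lane mask is X if SCC is set, else 0
    mask = (x & MASK64) if scc else 0
    # v_cndmask_b32_e64 v0, Z, Y, s[0:1]  ->  per lane: Y if mask bit set, else Z
    # (all lanes are modeled here; only the first active lane is read back)
    v0 = [y if (mask >> lane) & 1 else z for lane in range(WAVE_SIZE)]
    # v_readfirstlane_b32 s0, v0
    return v0[first_active_lane(exec_mask)]


def folded_sequence(scc, y, z):
    """Model of the proposed s_cselect_b32 s0, Y, Z."""
    return y if scc else z


# X is non-zero, but its bit for the first active lane (lane 0) is clear.
x, y, z, exec_mask = 0b10, 111, 222, 0b01
mismatch = (original_sequence(True, x, y, z, exec_mask),
            folded_sequence(True, y, z))

# With X == -1 every lane's bit is set, so the two sequences agree.
agree_when_all_ones = all(
    original_sequence(scc, -1, y, z, em) == folded_sequence(scc, y, z)
    for scc in (False, True)
    for em in (0, 0b01, 0b1000, MASK64)
)
```

In this model `mismatch` holds two different values when SCC is set and X is non-zero with a clear bit at the first active lane, while `agree_when_all_ones` is true for X == -1 under the EXEC masks tried.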
https://github.com/llvm/llvm-project/pull/167780