[llvm] [AMDGPU] Fix Inefficient S_CSELECT_B64 Sequence (PR #167780)

Fri Nov 14 03:08:16 PST 2025

https://github.com/jayfoad requested changes to this pull request.

> s_cselect_b64 s[0:1], X!=0, 0
> v_cndmask_b32_e64 v0, Z, Y, s[0:1]
> v_readfirstlane_b32 s0, v0
>
> To:
>
> s_cselect_b32 s0, Y, Z

That's not safe unless you know that the bit in X corresponding to the first active lane is 1. It is much easier to do this only if X == -1, where every lane's bit is set.
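To make the concern concrete, here is a minimal Python model of the two sequences. It is a sketch under stated assumptions, not the PR's code: the function names, the wave64 width, and the treatment of SCC as a plain boolean are illustrative, and the model assumes v_readfirstlane_b32 falls back to lane 0 when EXEC is zero.

```python
WAVE_SIZE = 64                      # assumption: wave64
MASK64 = (1 << WAVE_SIZE) - 1


def first_active_lane(exec_mask):
    """Index of the lowest set bit in EXEC (assumed: lane 0 if EXEC is zero)."""
    if exec_mask == 0:
        return 0
    return (exec_mask & -exec_mask).bit_length() - 1


def original_sequence(scc, x, y, z, exec_mask):
    """Model of s_cselect_b64 + v_cndmask_b32_e64 + v_readfirstlane_b32."""
    # s_cselect_b64 s[0:1], X, 0: the lane mask is X when SCC is set, else 0.
    mask = (x & MASK64) if scc else 0
    # v_cndmask_b32_e64 v0, Z, Y, s[0:1]: each lane gets Y if its mask bit
    # is set, otherwise Z. v_readfirstlane_b32 then reads the first active lane.
    lane = first_active_lane(exec_mask)
    return y if (mask >> lane) & 1 else z


def folded_sequence(scc, y, z):
    """Model of the proposed replacement: s_cselect_b32 s0, Y, Z."""
    return y if scc else z


def fold_is_safe(x, exec_mask, y=7, z=9):
    """True if both sequences agree for SCC = 0 and SCC = 1."""
    return all(
        original_sequence(scc, x, y, z, exec_mask) == folded_sequence(scc, y, z)
        for scc in (0, 1)
    )
```

In this model, `fold_is_safe(-1, exec_mask)` holds for any `exec_mask`, while `fold_is_safe(0b0010, 0b0001)` does not: X is non-zero, but its bit for the first active lane (lane 0) is clear, so the original sequence yields Z where the folded one yields Y.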

https://github.com/llvm/llvm-project/pull/167780