[PATCH] D116270: [AMDGPU] Enable divergence-driven XNOR selection
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 10 02:20:01 PST 2022
foad added a comment.
In D116270#3209307 <https://reviews.llvm.org/D116270#3209307>, @alex-t wrote:
> This looks like a regression in xnor.ll :
>
> s_not_b32 s0, s0              v_not_b32_e32 v0, v0
> v_xor_b32_e32 v0, s0, v0      v_xor_b32_e32 v0, s4, v0
>
> but it is not really. All the nodes in the example are divergent, and the divergent (xor x, -1) is selected to V_NOT_B32 now that https://reviews.llvm.org/D115884 has been committed.
> S_NOT_B32 appears on the left because of the custom optimization that converts S_XNOR_B32 back to NOT (XOR) for targets that have no V_XNOR. This optimization relies on the fact that the NOT operand is an SGPR and that V_XOR_B32_e32 can accept an SGPR as its first source operand.
> I am not sure if it is always safe. Execution of VALU instructions is controlled by the EXEC mask, but execution of SALU instructions is not.
To repeat what I have already said elsewhere: this is **not** a correctness issue. This is just an optimization, where you can choose to calculate either `~s0 ^ v0` or `s0 ^ ~v0` (or `~(s0 ^ v0)`) and get exactly the same result. The optimization is to prefer the first form, because the intermediate result `~s0` is uniform, so you can keep it in an SGPR and not waste VGPRs and VALU instructions.
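For anyone who wants to convince themselves of the equivalence, here is a minimal sketch in Python that models 32-bit registers with a mask. The helper names (`not32`, `xnor_not_scalar`, `xnor_not_vector`, `xnor_not_result`, `lanes_agree`) are made up for illustration and are not part of the patch or of LLVM; `s0` stands for the uniform value and each element of `lane_values` for one lane's `v0`.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers


def not32(x):
    """Bitwise NOT truncated to 32 bits."""
    return ~x & MASK32


def xnor_not_scalar(s0, v0):
    """~s0 ^ v0: the NOT is applied to the uniform operand."""
    return not32(s0) ^ v0


def xnor_not_vector(s0, v0):
    """s0 ^ ~v0: the NOT is applied to the per-lane operand."""
    return s0 ^ not32(v0)


def xnor_not_result(s0, v0):
    """~(s0 ^ v0): the NOT is applied to the XOR result."""
    return not32(s0 ^ v0)


def lanes_agree(s0, lane_values):
    """Return True if all three forms give the same result in every lane."""
    return all(
        xnor_not_scalar(s0, v) == xnor_not_vector(s0, v) == xnor_not_result(s0, v)
        for v in lane_values
    )
```

Calling `lanes_agree(0x12345678, [0, 1, 0xFFFFFFFF, 0xDEADBEEF])` returns `True`. In the first form, `not32(s0)` does not depend on the lane, which is why it can be computed once and kept in a scalar register.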
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116270/new/
https://reviews.llvm.org/D116270