[PATCH] D116270: [AMDGPU] Enable divergence-driven XNOR selection
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 6 11:05:27 PST 2022
rampitec added a comment.
In D116270#3224818 <https://reviews.llvm.org/D116270#3224818>, @alex-t wrote:
> In D116270#3215217 <https://reviews.llvm.org/D116270#3215217>, @rampitec wrote:
>
>> In D116270#3209307 <https://reviews.llvm.org/D116270#3209307>, @alex-t wrote:
>>
>>> This looks like a regression in xnor.ll :
>>>
>>> s_not_b32 s0, s0 v_not_b32_e32 v0, v0
>>> v_xor_b32_e32 v0, s0, v0 v_xor_b32_e32 v0, s4, v0
>>>
>>> but it is not really. All the nodes in the example are divergent and the divergent ( xor, x -1) is selected to V_NOT_B32 as of https://reviews.llvm.org/D115884 has been committed.
>>> S_NOT_B32 appears at the left because of the custom optimization that converts S_XNOR_B32 back to NOT (XOR) for the targets which have no V_XNOR. This optimization relies on the fact that if the NOT operand is SGPR and V_XOR_B32_e32 can accept SGPR as a first source operand.
>>> I am not sure if it is always safe. The VALU instructions execution is controlled by the EXEC mask but SALU is not.
>>
>> This is indeed a regression. It is always safe to keep s_not_b32 on SALU. Also note this effectively makes `SIInstrInfo::lowerScalarXnor()` useless. This is why XNOR was left behind by the D111907 <https://reviews.llvm.org/D111907>.
>
> `SIInstrInfo::lowerScalarXnor()` is exactly the part of the "manual" SALU to VALU lowering that I am trying to get rid of.
> The divergent "not" must be selected to the "V_NOT_B32_e32/64" otherwise we still have illegal VGPR to SGPR copies.
> This happens because the divergent "not" node has divergent operands and their result will be likely in VGPR.
> Also, we should select everything correctly first and can apply some peephole optimizations after.
> In other words: we should not "cheat ourselves" during the selection. The selection should be done fairly corresponding to the node divergence bit.
> Then we can apply the optimization in case it is safe.
> Note that this is not the only case when we would like to further optimize the code after selection.
> I'm planning to further add a separate pass for that.
>
> We cannot solve the problem in the custom selection procedure because NOT node operand has not yet been selected and we do not know if it is SGPR or VGPR.
> The only way, for now, is to post-process not(xor)/xor(not) in SIFixSGPRCopies. This may be considered a temporary hack until we have no proper pass for that.
`SIInstrInfo::lowerScalarXnor()` is dead after your patch and thus the patch has to remove it.
Then this is a clear regression, so if this requires a separate peephole later we need that peephole first and make sure the test does not regress.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116270/new/
https://reviews.llvm.org/D116270
More information about the llvm-commits
mailing list