[PATCH] D134423: [AMDGPU] Fix vgpr2sgpr copy analysis to check scalar operands of buffer instructions use scalar registers.

krishna chaitanya sankisa via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 8 00:46:15 PST 2022


skc7 added a comment.

In D134423#3912577 <https://reviews.llvm.org/D134423#3912577>, @alex-t wrote:

> In D134423#3911947 <https://reviews.llvm.org/D134423#3911947>, @foad wrote:
>
>> In D134423#3911726 <https://reviews.llvm.org/D134423#3911726>, @skc7 wrote:
>>
>>> %8:sreg_32 = COPY %5:vgpr_32 
>>> %7:vgpr_32 = BUFFER_LOAD_DWORD_OFFEN %4:vgpr_32, killed %6:sgpr_128, **%8:sreg_32**, 0, 0, 0, 0, implicit $exec ::
>>
>> I need more context. Is %5 uniform?
>
> I think that I've got an idea behind this patch. Let's say %5 is uniform. Then we've got to try to promote all the %8 descendants to SALU if possible.
> In some cases, it appears that such a copy has few or even no SALU descendants, and according to the common algorithm should be converted to VALU.
> When the conversion is done, legalizeOperands creates the waterfall loop which is obviously much worse than inserting the v_readfirstlane_b32.
> As far as I understand, @skc7 addresses this scenario and aims to avoid an unnecessary waterfall loop.
> BTW, if %5 is divergent we have a bug in ISel. We now should not have any V2S copy with the divergent source.

@alex-t @arsenm @foad This patch has been put up to fix the vgpr2sgpr copy instruction whose result is used as vgpr for srsrc/soffset of MUBUF/MTBUF. This is to fix the issue "Illegal virtual register for instruction". Shall I move to the initial version of the patch which just deals with vgpr2sgpr copy instruction? Legalizing the operands and coming to generic solution to fix such issues is beyond the scope of the patch as far as I understand.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134423/new/

https://reviews.llvm.org/D134423



More information about the llvm-commits mailing list