[PATCH] D134423: [AMDGPU] Fix vgpr2sgpr copy analysis to check scalar operands of buffer instructions use scalar registers.
krishna chaitanya sankisa via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 7 09:12:33 PST 2022
skc7 added a comment.
In D134423#3911947 <https://reviews.llvm.org/D134423#3911947>, @foad wrote:
> In D134423#3911726 <https://reviews.llvm.org/D134423#3911726>, @skc7 wrote:
>
>> %8:sreg_32 = COPY %5:vgpr_32
>> %7:vgpr_32 = BUFFER_LOAD_DWORD_OFFEN %4:vgpr_32, killed %6:sgpr_128, **%8:sreg_32**, 0, 0, 0, 0, implicit $exec ::
>
> I need more context. Is %5 uniform?

Input LLVM IR:

define <4 x i32> @extract0_bitcast_raw_buffer_load_v4i32(<4 x i32> inreg %rsrc, i32 %ofs, i32 %sofs) local_unnamed_addr #0 {
  %var = tail call <4 x i32> @llvm.amdgcn.raw.buffer.load.v4i32(<4 x i32> %rsrc, i32 %ofs, i32 %sofs, i32 0)
  ret <4 x i32> %var
}

IR dump after amdgpu-isel:
bb.0 (%ir-block.0):
  liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5
  %5:vgpr_32 = COPY $vgpr5
  %4:vgpr_32 = COPY $vgpr4
  %3:vgpr_32 = COPY $vgpr3
  %2:vgpr_32 = COPY $vgpr2
  %1:vgpr_32 = COPY $vgpr1
  %0:vgpr_32 = COPY $vgpr0
  %6:sgpr_128 = REG_SEQUENCE %0:vgpr_32, %subreg.sub0, %1:vgpr_32, %subreg.sub1, %2:vgpr_32, %subreg.sub2, %3:vgpr_32, %subreg.sub3
  %8:sreg_32 = COPY %5:vgpr_32
  %7:vreg_128 = BUFFER_LOAD_DWORDX4_OFFEN %4:vgpr_32, killed %6:sgpr_128, %8:sreg_32, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s128), align 1, addrspace 4)
  %9:vgpr_32 = COPY %7.sub0:vreg_128
  %10:vgpr_32 = COPY %7.sub1:vreg_128
  %11:vgpr_32 = COPY %7.sub2:vreg_128
  %12:vgpr_32 = COPY %7.sub3:vreg_128
  $vgpr0 = COPY %9:vgpr_32
  $vgpr1 = COPY %10:vgpr_32
  $vgpr2 = COPY %11:vgpr_32
  $vgpr3 = COPY %12:vgpr_32
  SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3

https://reviews.llvm.org/D134423