[PATCH] D78829: [AMDGPU] Make SREG_LO16 legal

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 24 16:49:05 PDT 2020


arsenm added a comment.

In D78829#2002946 <https://reviews.llvm.org/D78829#2002946>, @rampitec wrote:

> In D78829#2002912 <https://reviews.llvm.org/D78829#2002912>, @arsenm wrote:
>
> > Could we leave all instructions as 32-bit defs, but then have a 16-bit subreg copy as the only use?
> >
> > %0:vgpr_32 = V_FOO_U16
> >  %1:vgpr_16lo = COPY %0.sub16_lo
> >  ....
> >
> > The register allocator would understand this and fold out the copy
>
>
> So you want to keep SReg_32 as an RC for i16 and f16? And add a copy to each instruction producing a 16 bit value?
>
> I am afraid it will not work. Instruction needs to produce i16 for which legal class is SReg_32, we will return SReg_lo16. Selector will either complain, not match or emit yet another copy. The problem is a 16 bit value needs to leave either in a 32 bit or a 16 bit RC, but not both.


This would probably require new support in the InstrEmitter or patterns but I don't think is impossible



================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:544-547
+    }
+
+    RC = RI.getPhysRegClass(DestReg);
+  }
----------------
rampitec wrote:
> arsenm wrote:
> > We should make it so that MachineCopyPropagation can deal with this case
> Hm... So you want it always? I thought this is a red flag.
I mean MCP should eliminate these identity copies. It shouldn't see the src == dest due to 16-bit obscuring this (at least for > -O0)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78829/new/

https://reviews.llvm.org/D78829





More information about the llvm-commits mailing list