[PATCH] D78829: [AMDGPU] Make SREG_LO16 legal

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Apr 25 06:51:59 PDT 2020


arsenm added a comment.

In D78829#2003007 <https://reviews.llvm.org/D78829#2003007>, @arsenm wrote:

> In D78829#2002946 <https://reviews.llvm.org/D78829#2002946>, @rampitec wrote:
>
> > In D78829#2002912 <https://reviews.llvm.org/D78829#2002912>, @arsenm wrote:
> >
> > > Could we leave all instructions as 32-bit defs, but then have a 16-bit subreg copy as the only use?
> > >
> > >   %0:vgpr_32 = V_FOO_U16
> > >   %1:vgpr_16lo = COPY %0.sub16_lo
> > >   ....
> > >
> > > The register allocator would understand this and fold out the copy
> >
> >
> > So you want to keep SReg_32 as an RC for i16 and f16? And add a copy to each instruction producing a 16-bit value?
> >
> > I am afraid it will not work. The instruction needs to produce an i16, for which the legal class is SReg_32, but we will return SReg_lo16. The selector will either complain, fail to match, or emit yet another copy. The problem is that a 16-bit value needs to live in either a 32-bit or a 16-bit RC, but not both.
>
>
> This would probably require new support in the InstrEmitter or patterns, but I don't think it is impossible.


This is what it really needs to look like; otherwise it's lying about producing a 16-bit result. If we pretend the instruction really only writes 16 bits, we can't make use of the different high-bit handling behaviors.
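
To make the high-bit point concrete, here is a rough sketch in plain C++. It is not LLVM code: the names `HighBits`, `write16`, and `readLo16` are made up for illustration, and it assumes only two possible behaviors (zeroing or preserving the upper half) without saying which hardware does which. It models a 16-bit result being written into a 32-bit register, and the low-half read that the `COPY %0.sub16_lo` in the quoted example stands for.

```cpp
#include <cstdint>

// Hypothetical model, not LLVM code: what may happen to the upper 16 bits
// of a 32-bit register when an instruction produces a 16-bit result.
enum class HighBits { Zero, Preserve };

// Write a 16-bit result into the low half of a 32-bit register value.
// The high half is either cleared or carried over from the old contents.
std::uint32_t write16(std::uint32_t oldReg, std::uint16_t result, HighBits mode) {
  std::uint32_t high = (mode == HighBits::Preserve) ? (oldReg & 0xFFFF0000u) : 0u;
  return high | result;
}

// Read only the low 16 bits, like a copy from a lo16 subregister.
std::uint16_t readLo16(std::uint32_t reg) {
  return static_cast<std::uint16_t>(reg & 0xFFFFu);
}
```

Both modes give the same low half, so a consumer that only reads the 16-bit subregister cannot tell them apart. The full 32-bit value differs, and that difference is only visible if the def is modeled as 32 bits wide.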


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78829/new/

https://reviews.llvm.org/D78829




