[PATCH] D78829: [AMDGPU] Make SREG_LO16 legal

Sat Apr 25 09:32:58 PDT 2020

rampitec added a comment.

In D78829#2003535 <https://reviews.llvm.org/D78829#2003535>, @arsenm wrote:

> In D78829#2003007 <https://reviews.llvm.org/D78829#2003007>, @arsenm wrote:
>
> > In D78829#2002946 <https://reviews.llvm.org/D78829#2002946>, @rampitec wrote:
> >
> > > In D78829#2002912 <https://reviews.llvm.org/D78829#2002912>, @arsenm wrote:
> > >
> > > > Could we leave all instructions as 32-bit defs, but then have a 16-bit subreg copy as the only use?
> > > >
> > > > %0:vgpr_32 = V_FOO_U16
> > > > %1:vgpr_16lo = COPY %0.sub16_lo
> > > > ....
> > > >
> > > > The register allocator would understand this and fold out the copy
> > >
> > >
> > > So you want to keep SReg_32 as an RC for i16 and f16? And add a copy to each instruction producing a 16 bit value?
> > >
> > > I am afraid it will not work. The instruction needs to produce i16, for which the legal class is SReg_32, but we will return SReg_lo16. The selector will either complain, fail to match, or emit yet another copy. The problem is that a 16-bit value needs to live in either a 32-bit or a 16-bit RC, but not both.
> >
> >
> > This would probably require new support in the InstrEmitter or patterns, but I don't think it is impossible
>
>
> This is what it really needs to look like, otherwise it's lying about producing a 16-bit result. If we pretend it really only writes 16 bits, we can't make use of the different high-bit handling behaviors

Hm... That's true. Source operands must be 16-bit and the destination 32-bit.
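
Roughly, as a sketch only, reusing the placeholder opcode and class names from the example quoted above (%0 and %1 are made-up 16-bit inputs, and the exact operand syntax is an assumption):

  %2:vgpr_32 = V_FOO_U16 %0:vgpr_16lo, %1:vgpr_16lo
  %3:vgpr_16lo = COPY %2.sub16_lo

That is, the instruction reads 16-bit sources, defines a full 32-bit register, and only the users see the 16-bit subregister through the copy.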

https://reviews.llvm.org/D78829