[PATCH] D74937: [AMDGPU] Implement copyPhysReg for 16 bit subregs

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 20 17:47:39 PST 2020


rampitec added a comment.

In D74937#1885695 <https://reviews.llvm.org/D74937#1885695>, @arsenm wrote:

> In D74937#1885691 <https://reviews.llvm.org/D74937#1885691>, @rampitec wrote:
>
> > In D74937#1885689 <https://reviews.llvm.org/D74937#1885689>, @arsenm wrote:
> >
> > > In D74937#1885675 <https://reviews.llvm.org/D74937#1885675>, @rampitec wrote:
> > >
> > > > In D74937#1885659 <https://reviews.llvm.org/D74937#1885659>, @arsenm wrote:
> > > >
> > > > > I don't think copies of these should ever be produced (at leasts for the high half) since the high half is not really addressable, and only appears that way to some instructions. Where are copies coming from?
> > > >
> > > >
> > > > First, hi16 registers are used by load_hi instructions, that is their destination. And then RA can happily copy anything to anything. For sanity we need to know how to copy any register.
> > >
> > >
> > > The high result isn't what's encoded though, so they really are writing the 32-bit register. They only read the low 16-bits. I think the correct way to model this is a 32-bit write but only a 16-bit read
> >
> >
> > Low16 are preserved and if we say we write 32 bit then we cannot model it.
>
>
> I think declaring the high 16 is the output register is still wrong and not how it's encoded. Having only the 16-bit read is still an improvement


The testcase I showed in a parent revision only works if I define both, and for a reason: only so we can model independent subreg access.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74937/new/

https://reviews.llvm.org/D74937





More information about the llvm-commits mailing list