[PATCH] D156105: [AMDGPU][True16] Support generating differently-sized register transfers.

Thu Jul 27 13:22:55 PDT 2023

Joe_Nash added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:740-744
+        MCRegister &RegToFix = (Size == 16) ? DestReg : SrcReg;
+        MCRegister Super = RI.get32BitRegister(RegToFix);
+        assert(RI.getSubReg(Super, AMDGPU::lo16) == RegToFix ||
+               RI.getSubReg(Super, AMDGPU::hi16) == RegToFix);
+        RegToFix = Super;
----------------
kosarev wrote:
> arsenm wrote:
> > Is this actually reachable? I forget how exactly we ended up with this partial 16-bit register thing
> I'm going to try to understand more about what's going on here, but maybe @Joe_Nash already knows the answer as he was working on that bit.
It is covered by lo16-32bit-physreg-copy.mir.

In theory it is reachable on any target with 16-bit instructions (excluding GFX11 with True16 instructions). On those targets, this code converts
  %1:vgpr_32 = COPY %2:vgpr_lo16
or
  %1:vgpr_lo16 = COPY %2:vgpr_32
into v_mov_b32.
On GFX11 it converts to v_mov_b16 instead.
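
To make the widening step concrete, here is a minimal standalone sketch of the fix-up in the quoted hunk. It is a toy model, not the LLVM code: Reg, Part, Copy, sizeInBits and widenMixedSizeCopy are hypothetical names, and get32BitRegister/getSubReg are simplified stand-ins for the RI methods. I am also assuming the hunk is only reached when exactly one side of the copy is 16 bits wide.

```cpp
#include <cassert>

// Toy model of a VGPR: a register index plus which part of it is named.
// This is NOT LLVM's MCRegister, just an illustration.
enum class Part { Full32, Lo16, Hi16 };

struct Reg {
  unsigned Index;
  Part P;
  bool operator==(const Reg &O) const { return Index == O.Index && P == O.P; }
};

// Size of the named register part in bits.
unsigned sizeInBits(Reg R) { return R.P == Part::Full32 ? 32 : 16; }

// Simplified stand-in for RI.get32BitRegister: the 32-bit register
// that contains R.
Reg get32BitRegister(Reg R) { return {R.Index, Part::Full32}; }

// Simplified stand-in for RI.getSubReg: the named half of a 32-bit register.
Reg getSubReg(Reg Super, Part P) { return {Super.Index, P}; }

struct Copy {
  Reg Dest;
  Reg Src;
};

// If exactly one side of the copy is 16 bits wide (assumed guard), replace
// that side with its 32-bit super-register so that both operands are 32-bit
// and a single 32-bit move can implement the copy.
Copy widenMixedSizeCopy(Reg Dest, Reg Src) {
  unsigned Size = sizeInBits(Dest);
  unsigned SrcSize = sizeInBits(Src);
  if ((Size == 16) != (SrcSize == 16)) {
    Reg &RegToFix = (Size == 16) ? Dest : Src;
    Reg Super = get32BitRegister(RegToFix);
    assert(getSubReg(Super, Part::Lo16) == RegToFix ||
           getSubReg(Super, Part::Hi16) == RegToFix);
    RegToFix = Super;
  }
  return {Dest, Src};
}
```

In this model, copying from the low half of register 2 into full register 1 becomes a copy between full registers 2 and 1, which is the shape that maps onto v_mov_b32. Same-sized copies pass through unchanged.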

As to whether the code sequence in that MIR test (or any use of a VGPR_LO16 register) is produced in any legitimate shader: I don't think so. That probably needs to be verified empirically on a test corpus, but if it holds, this code and, I think, VGPR_LO16/VGPR_HI16 can be removed.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156105/new/

https://reviews.llvm.org/D156105