[PATCH] D133723: [AMDGPU][GFX11] Use VGPR_32_F128 for VOP1,2,C

Wed Sep 14 07:29:15 PDT 2022

Joe_Nash marked an inline comment as done.
Joe_Nash added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:3279-3281
+  assert(Opc != AMDGPU::V_FMAC_F16_T16_e32 &&
+         "V_FMAC_F16_T16_e32 is not supported and not expected to be present "
+         "pre-RA");
----------------
foad wrote:
> Joe_Nash wrote:
> > foad wrote:
> > > My understanding is that having a _T16_e32 instruction here should work correctly, but it is undesirable because it has more restrictive register classes than the _T16_e64 form. So perhaps we want a more general assertion in (or just before) the register allocator that there are no _T16_e32 instructions present? I'm not sure how or where to implement that.
> > From the perspective of the RA, allocating a _T16_e32 will work correctly. 
> > 
> > In this particular function, there is a restriction. V_FMAC_F16_T16_e32 -> V_FMA_F16_gfx9_e64 should not be allowed, because the register class in the former is VGPR_32_F128 and in the latter is VGPR_32. So it would require a COPY or something, which we are not doing. 
> Is a COPY needed in that situation? I thought perhaps the register allocator would just handle it, if the register classes overlapped. But I'm not sure. Anyway I guess this assert is fine for now at least.
The MIR verifier will fail immediately after the twoaddressinstruction pass if you make an instruction like that.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133723/new/

https://reviews.llvm.org/D133723