[PATCH] D133723: [AMDGPU][GFX11] Use VGPR_32_F128 for VOP1,2,C

Wed Sep 14 14:27:54 PDT 2022

Joe_Nash marked 4 inline comments as done.
Joe_Nash added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:928
+    (f32 (f16_to_fp i32:$src0)),
+    (cvt32to16_e64 SRCMODS.NONE, $src0)
+  >;
----------------
dp wrote:
> The names `cvt16to32_e64` and `cvt32to16_e64` are misleading. They should be the other way around.
Thanks, I have fixed the names.

================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:987
+let SubtargetPredicate = HasTrue16BitInsts in
+defm : f16_fp_Pats<V_CVT_F16_F32_T16_e64, V_CVT_F32_F16_T16_e64>;

----------------
dp wrote:
> Why are these patterns special? `_E64` variants have no VGPR limitations. Is this required for future changes?
This is simply to account for the fact that we have new pseudo instructions for each instruction with 16 bit operands on GFX11. Therefore if a pattern directly directly refers to an instruction we need to duplicate it, one for each pseudo. 

================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.td:557
+
+def VGPR_32_F128 : SIRegisterClass<"AMDGPU", !listconcat(Reg32Types.types, Reg16Types.types), 32,
+                            (add (sequence "VGPR%u", 0, 127))> {
----------------
rampitec wrote:
> arsenm wrote:
> > Joe_Nash wrote:
> > > arsenm wrote:
> > > > I don't know what "_F128" is supposed to mean. I read this as a class for long double
> > > It is short for First 128. Is a set with only the first 128 VGPRs. I will add a comment noting this. 
> > Lo128 would probably be more consistent terminology over "first"
> I probably agree to that. F128 also hints a long double to me.
> With that I still hope this class is transitional and will be replaced by a real 16 bit RC.
Ok, replaced _F128 with _Lo128. 
Yes, this is a transitional class.

================
Comment at: llvm/lib/Target/AMDGPU/VOP3Instructions.td:928
 defm V_CVT_PK_U16_F32      : VOP3_Realtriple_gfx11<0x307>;
-defm V_MAX_U16             : VOP3Only_Realtriple_gfx11<0x309>;
-defm V_MAX_I16             : VOP3Only_Realtriple_gfx11<0x30a>;
-defm V_MIN_U16             : VOP3Only_Realtriple_gfx11<0x30b>;
-defm V_MIN_I16             : VOP3Only_Realtriple_gfx11<0x30c>;
+defm V_MAX_U16_T16         : VOP3Only_Realtriple_T16_gfx11<0x309, "v_max_u16">;
+defm V_MAX_I16_T16         : VOP3Only_Realtriple_T16_gfx11<0x30a, "v_max_i16">;
----------------
arsenm wrote:
> Since the _T16 doesn't appear in the instruction mnemonic it should be lowercased
replaced _T16 with _t16 everywhere

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133723/new/

https://reviews.llvm.org/D133723