[PATCH] D126475: [AMDGPU] gfx11 vop3 instructions
Dmitry Preobrazhensky via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri May 27 08:56:17 PDT 2022
dp added a comment.
Overall looks good.
This change also defines VOP1 and VOP2 opcodes so the name of the change is a bit misleading.
Also I noticed that there are instructions defined by this change that have no tests, for example: v_minmax*, v_maxmin*, v_div_scale*, v_sub_nc_i16, v_ashrrev_i64.
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.td:1482
def VOP3PModsDOT : ComplexPattern<untyped, 2, "SelectVOP3PModsDOT">;
+def DotIUVOP3PMods : ComplexPattern<untyped, 1, "SelectDotIUVOP3PMods">;
----------------
Not used in this change, should go with VOP3P.
================
Comment at: llvm/lib/Target/AMDGPU/VOP3Instructions.td:816
+defm V_MAD_I64_I32_gfx11 : VOP3be_Real_gfx11<0x2ff, "V_MAD_I64_I32_gfx11", "v_mad_i64_i32">;
+// V_ADD_CO_U32, V_SUB_CO_U32, V_SUBREV_CO_U32 in VOP2Instructions.td
+defm V_CVT_PK_I16_F32 : VOP3_Realtriple_gfx11<0x306>;
----------------
What is so special with these opcodes that they are mentioned here? E.g. why v_add_f16 is not in this list?
================
Comment at: llvm/lib/Target/AMDGPU/VOP3Instructions.td:824
+defm V_CVT_PK_NORM_U16_F16 : VOP3_Realtriple_with_name_gfx11<0x313, "V_CVT_PKNORM_U16_F16" , "v_cvt_pk_norm_u16_f16" >;
+// LDEXP_F32, BFM_B32, BCNT, mbcntlo, mbcnt hi, cvt_pknorm_{i,u}16_f32, cvt_pk_u16_u32, cvt_pk_i16_i32 in VOP2
+defm V_SUB_NC_I32 : VOP3_Realtriple_with_name_gfx11<0x325, "V_SUB_I32", "v_sub_nc_i32">;
----------------
Ditto.
================
Comment at: llvm/lib/Target/AMDGPU/VOPInstructions.td:661
-class VOP_DPP_Pseudo <string OpName, VOPProfile P, list<dag> pattern=[]> :
- InstSI <P.OutsDPP, P.InsDPP, OpName#P.AsmDPP, pattern>,
+class VOP_DPP_Pseudo <string OpName, VOPProfile P, list<dag> pattern=[],
+ dag Ins = P.InsDPP, string asmOps = P.AsmDPP> :
----------------
This and the following DPP/DPP8 changes do not look relevant to this patch.
================
Comment at: llvm/lib/Target/AMDGPU/VOPInstructions.td:969
+}
+// In VOP1, we can have clamp and omod even if !HasModifiers
class getVOP3Pat<VOPProfile P, SDPatternOperator node> {
----------------
This sounds rather cryptic to me. Why VOP1 only?
================
Comment at: llvm/test/MC/AMDGPU/gfx11_vop123.s:2244
+
+v_dot2_f16_f16 v0, v1, v2, v3 op_sel:[0,0,1,1]
+// GFX11: encoding: [0x00,0x60,0x66,0xd6,0x01,0x05,0x0e,0x04]
----------------
Out of 6400 tests I found only 4 tests with op_sel. Is there a reason for that? Are you planning to switch to sp3 syntax?
================
Comment at: llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_all.txt:19694
+
+# GFX11: v_pk_add_f16 v0, v1, v2 ; encoding: [0x00,0x40,0x0f,0xcc,0x01,0x05,0x02,0x18]
+0x00,0x40,0x0f,0xcc,0x01,0x05,0x02,0x18
----------------
This and the following tests are for VOP3P and should be moved elsewhere.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D126475/new/
https://reviews.llvm.org/D126475
More information about the llvm-commits
mailing list