[PATCH] D25975: AMDGPU/SI: Make f16 a legal type for VI subtargets
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 8 14:20:43 PST 2016
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:2041-2044
+ if (Op.getOperand(0).getValueType() == MVT::i64)
+ return Op.getOpcode() == ISD::SINT_TO_FP ?
+ AMDGPUTargetLowering::LowerSINT_TO_FP(Op, DAG) :
+ AMDGPUTargetLowering::LowerUINT_TO_FP(Op, DAG);
----------------
Braces
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3768
// not support denormals ever.
- if (VT == MVT::f32 &&
- !Subtarget->hasFP32Denormals()) {
+ if (VT == MVT::f32 && !Subtarget->hasFP32Denormals()) {
SDValue LHS = N->getOperand(0);
----------------
Should this also handle f16? Could be separate optimization patch
================
Comment at: lib/Target/AMDGPU/SIISelLowering.h:51
+ /// floating point type \p VT, by either extending or truncating it.
+ SDValue GetFPExtOrFPTrunc(SelectionDAG &DAG,
+ SDValue Op,
----------------
should start with lower case
================
Comment at: lib/Target/AMDGPU/SIInstructions.td:422-423
+def : Pat <
+ (f16 (fpround f64:$src)),
+ (V_CVT_F16_F32_e32 (V_CVT_F32_F64_e32 $src))
+>;
----------------
Is this (and fpextend) correct? A correct lowering for these was just added I thought
================
Comment at: lib/Target/AMDGPU/SIRegisterInfo.td:254
// See comments in SIInstructions.td for more info.
def SReg_32_XM0 : RegisterClass<"AMDGPU", [i32, f32], 32,
(add SGPR_32, VCC_LO, VCC_HI, EXEC_LO, EXEC_HI, FLAT_SCR_LO, FLAT_SCR_HI,
----------------
Missing i16/f16 but it probably doesn't matter
================
Comment at: lib/Target/AMDGPU/SISchedule.td:29
// Vector ALU instructions
+def Write16Bit : SchedWrite;
def Write32Bit : SchedWrite;
----------------
I don't think we need this since they run at the same rat as 32-bit
================
Comment at: lib/Target/AMDGPU/VOP2Instructions.td:326
+
+// FIXME: V_LDEXP_F16 requires a change to llvm.amdgcn.ldexp intrinsic.
+defm V_LDEXP_F16 : VOP2Inst <"v_ldexp_f16", VOP_F16_F16_I16/*, AMDGPUldexp*/>;
----------------
Why is this necessary? It can already mangle the FP type. This should work if you change it to VOP_F16_F16_I32, the int type doesn't matter
https://reviews.llvm.org/D25975
More information about the llvm-commits
mailing list