[PATCH] D25975: AMDGPU/SI: Make f16 a legal type for VI subtargets

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 26 09:52:43 PDT 2016


arsenm added inline comments.


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:82-83
 
   // TODO: Subtarget feature for i16
-  if (Subtarget->getGeneration() >= AMDGPUSubtarget::VOLCANIC_ISLANDS)
+  if (Subtarget->getGeneration() >= AMDGPUSubtarget::VOLCANIC_ISLANDS) {
     addRegisterClass(MVT::i16, &AMDGPU::SReg_32RegClass);
----------------
The TODO was taken care of already


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:85-86
     addRegisterClass(MVT::i16, &AMDGPU::SReg_32RegClass);
+    addRegisterClass(MVT::f16, &AMDGPU::SReg_32RegClass);
+    addRegisterClass(MVT::f16, &AMDGPU::VGPR_32RegClass);
+  }
----------------
tstellarAMD wrote:
> There is a 1-to-1 mapping between types and register classes, so you can drop the first addRegisterClass(MVT::f16, ... ) call
I think this should only be added to SReg_32. We already have other random inconsistencies from having f32 in a VGPR class and i32 in SGPR which I've been trying to fix


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3602
 
-  if (VT != MVT::f32 && VT != MVT::f64)
+  if (VT != MVT::f16 && VT != MVT::f32 && VT != MVT::f64)
     return SDValue();
----------------
Does this need an f16 is legal check? If not this can probably just be an !isVector check, any of the other FP types are unusable 


================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:1415-1416
 
-  if (Opc == AMDGPU::V_MAD_F32 || Opc == AMDGPU::V_MAC_F32_e64) {
+  if (Opc == AMDGPU::V_MAD_F16 || Opc == AMDGPU::V_MAC_F16_e64 ||
+      Opc == AMDGPU::V_MAD_F32 || Opc == AMDGPU::V_MAC_F32_e64) {
+    bool IsF16 = Opc == AMDGPU::V_MAD_F16 || Opc == AMDGPU::V_MAC_F16_e64;
----------------
I would prefer checking the more common f32 cases first


================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:1505
 
-      if (Opc == AMDGPU::V_MAC_F32_e64) {
+      if (Opc == AMDGPU::V_MAC_F16_e64 || Opc == AMDGPU::V_MAC_F32_e64) {
         UseMI.untieRegOperand(
----------------
Ditto


================
Comment at: lib/Target/AMDGPU/SIInstructions.td:443-445
+  (f16 (fmad (VOP3NoMods0 f32:$src0, i32:$src0_modifiers, i1:$clamp, i32:$omod),
+             (VOP3NoMods f32:$src1, i32:$src1_modifiers),
+             (VOP3NoMods f32:$src2, i32:$src2_modifiers))),
----------------
The f32 patterns on the sources look wrong. Not sure how this compiles 


https://reviews.llvm.org/D25975





More information about the llvm-commits mailing list