[PATCH] D25975: AMDGPU/SI: Make f16 a legal type for VI subtargets

Wed Nov 9 06:54:50 PST 2016

kzhuravl added inline comments.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3768
     // not support denormals ever.
-    if (VT == MVT::f32 &&
-        !Subtarget->hasFP32Denormals()) {
+    if (VT == MVT::f32 && !Subtarget->hasFP32Denormals()) {
       SDValue LHS = N->getOperand(0);
----------------
arsenm wrote:
> Should this also handle f16? Could be separate optimization patch
Yes, I was initially planning to do it in this patch, but then decided that separate patch would be better.

================
Comment at: lib/Target/AMDGPU/SIInstructions.td:422-423
+def : Pat <
+  (f16 (fpround f64:$src)),
+  (V_CVT_F16_F32_e32 (V_CVT_F32_F64_e32 $src))
+>;
----------------
arsenm wrote:
> Is this (and fpextend) correct? A correct lowering for these was just added I thought
I think so. Without these changes a few existing tests (i.e. fptrunc.ll) were failing with the "cannot select error".

================
Comment at: lib/Target/AMDGPU/VOP2Instructions.td:326
+
+// FIXME: V_LDEXP_F16 requires a change to llvm.amdgcn.ldexp intrinsic.
+defm V_LDEXP_F16 : VOP2Inst <"v_ldexp_f16", VOP_F16_F16_I16/*, AMDGPUldexp*/>;
----------------
arsenm wrote:
> Why is this necessary? It can already mangle the FP type. This should work if you change it to VOP_F16_F16_I32, the int type doesn't matter
Would this be acceptable to use VOP_F16_F16_I32? Spec says:
```
D.f16 = S0.f16 * (2 ** S1.i16)
```
So it should be i16. I have a follow up patch that changes ldexp intrinsic.

https://reviews.llvm.org/D25975