[PATCH] D28496: [AMDGPU] Implement f16 fcopysign

Mon Jan 9 15:32:06 PST 2017

arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:313-314
     setOperationAction(ISD::FDIV, MVT::f16, Custom);
+    if (!Subtarget->hasBFI())
+      setOperationAction(ISD::FCOPYSIGN, MVT::f16, Expand);

----------------
This shouldn't be necessary. hasBFI() will only be false for R600 targets, and the default action for illegal types is Expand.

================
Comment at: lib/Target/AMDGPU/SIInstructions.td:680-683
 def : Pat <
+  (fcopysign f16:$src0, f16:$src1),
+  (V_BFI_B32 (S_MOV_B32 (i32 0x00007fff)), $src0, $src1)
+>;
----------------
It is possible to have mismatched FP types for src0 and src1. If you can come up with a test case that combines this with FP casts, you might need to handle that too.
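
For illustration only, here is a rough C++ model of what the pattern above computes on raw bits. It assumes V_BFI_B32 selects bits from its second operand where the mask is set and from its third operand elsewhere, and that the f16 value sits in the low 16 bits of a 32-bit register. The helper names (bfi, copysign_f16_bits, copysign_f16_f32_bits) are hypothetical and not part of the patch. The last helper sketches one way a mismatched case (f16 magnitude, f32 sign) could be handled, by shifting the f32 sign bit from bit 31 down to bit 15 first; it is not necessarily how the backend should lower it.

```cpp
#include <cstdint>

// Bitfield insert, modelled on V_BFI_B32 (assumed semantics):
// take bits from `a` where `mask` is 1 and from `b` where `mask` is 0.
constexpr uint32_t bfi(uint32_t mask, uint32_t a, uint32_t b) {
  return (mask & a) | (~mask & b);
}

// f16 copysign on raw bit patterns. The mask 0x7fff keeps the exponent
// and mantissa of `mag`; bit 15 (the sign) comes from `sign`.
constexpr uint16_t copysign_f16_bits(uint16_t mag, uint16_t sign) {
  return static_cast<uint16_t>(bfi(0x00007fffu, mag, sign));
}

// Hypothetical mismatched case: f16 magnitude with an f32 sign operand.
// The f32 sign lives in bit 31, so it is shifted down to bit 15 first.
constexpr uint16_t copysign_f16_f32_bits(uint16_t mag, uint32_t sign) {
  return static_cast<uint16_t>(bfi(0x00007fffu, mag, sign >> 16));
}
```

With these helpers, the half-precision bit pattern 0x3C00 (1.0) combined with the sign of 0xC000 (-2.0) gives 0xBC00 (-1.0). Without the shift, the f32 sign bit would be discarded when the result is truncated to 16 bits, which is why the same-type pattern alone does not cover the mismatched case.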

================
Comment at: test/CodeGen/AMDGPU/fcopysign.f16.ll:1
+; RUN: llc -march=amdgcn -mcpu=SI -verify-machineinstrs < %s | FileCheck -check-prefix=SI -check-prefix=GCN -check-prefix=FUNC %s
+; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck -check-prefix=VI -check-prefix=GCN -check-prefix=FUNC %s
----------------
Don't use -mcpu=SI.

https://reviews.llvm.org/D28496