[PATCH] [X86] Teach how to custom lower double-to-half conversions under fast-math.

Mon Feb 23 11:12:49 PST 2015

Hi Ahmed,

Thanks for the review!
Here is an updated patch. As you suggested, I moved the expansion of FP_TO_FP16 into the target-independent DAG legalizer.
I also updated the test, adding extra CHECK lines for the 'long double-to-float' conversion.

The 'long double-to-float' conversion is currently implemented by the sequence: fldt+fstps+movss.
Basically, the long double passed to the function is first loaded onto the top of the x87 FPU register stack (fldt), and then stored as a float to a stack slot in memory while being popped off the FPU stack (fstps). Finally, it is loaded from that slot as a float using a zero-extending movss. The code sequence looks a bit redundant but it works :-)

Please let me know what you think.

Thanks!
-Andrea


http://reviews.llvm.org/D7832

Files:
  lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
  test/CodeGen/X86/fastmath-float-half-conversion.ll

Index: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
===================================================================

--- lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
+++ lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
@@ -3548,6 +3548,20 @@
     break;
   }
   case ISD::FP_TO_FP16: {
+    if (!TM.Options.UseSoftFloat && TM.Options.UnsafeFPMath) {
+      SDValue Op = Node->getOperand(0);
+      MVT SVT = Op.getSimpleValueType();
+      if (SVT == MVT::f64 || SVT == MVT::f80) {
+        // Under fastmath, we can expand this node into a fround followed by
+        // a float-half conversion.
+        SDValue FloatVal = DAG.getNode(ISD::FP_ROUND, dl, MVT::f32, Op,
+                                       DAG.getIntPtrConstant(0));
+        Results.push_back(
+            DAG.getNode(ISD::FP_TO_FP16, dl, MVT::i16, FloatVal));
+        break;
+      }
+    }
+
     RTLIB::Libcall LC =
         RTLIB::getFPROUND(Node->getOperand(0).getValueType(), MVT::f16);
     assert(LC != RTLIB::UNKNOWN_LIBCALL && "Unable to expand fp_to_fp16");
Index: test/CodeGen/X86/fastmath-float-half-conversion.ll
===================================================================
--- test/CodeGen/X86/fastmath-float-half-conversion.ll
+++ test/CodeGen/X86/fastmath-float-half-conversion.ll
@@ -0,0 +1,49 @@
+; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+f16c < %s | FileCheck %s
+
+define zeroext i16 @test1_fast(double %d) #0 {
+; CHECK-LABEL: test1_fast:
+; CHECK-NOT: callq  {{_+}}truncdfhf2
+; CHECK: vcvtsd2ss %xmm0, %xmm0, %xmm0
+; CHECK-NEXT: vcvtps2ph $0, %xmm0, %xmm0
+; CHECK: ret
+entry:
+  %0 = tail call i16 @llvm.convert.to.fp16.f64(double %d)
+  ret i16 %0
+}
+
+define zeroext i16 @test2_fast(x86_fp80 %d) #0 {
+; CHECK-LABEL: test2_fast:
+; CHECK-NOT: callq  {{_+}}truncxfhf2
+; CHECK: fldt
+; CHECK-NEXT: fstps
+; CHECK-NEXT: vmovss
+; CHECK-NEXT: vcvtps2ph $0, %xmm0, %xmm0
+; CHECK: ret
+entry:
+  %0 = tail call i16 @llvm.convert.to.fp16.f80(x86_fp80 %d)
+  ret i16 %0
+}
+
+define zeroext i16 @test1(double %d) #1 {
+; CHECK-LABEL: test1:
+; CHECK: callq  {{_+}}truncdfhf2
+; CHECK: ret
+entry:
+  %0 = tail call i16 @llvm.convert.to.fp16.f64(double %d)
+  ret i16 %0
+}
+
+define zeroext i16 @test2(x86_fp80 %d) #1 {
+; CHECK-LABEL: test2:
+; CHECK: callq  {{_+}}truncxfhf2
+; CHECK: ret
+entry:
+  %0 = tail call i16 @llvm.convert.to.fp16.f80(x86_fp80 %d)
+  ret i16 %0
+}
+
+declare i16 @llvm.convert.to.fp16.f64(double)
+declare i16 @llvm.convert.to.fp16.f80(x86_fp80)
+
+attributes #0 = { nounwind readnone uwtable "unsafe-fp-math"="true" "use-soft-float"="false" }
+attributes #1 = { nounwind readnone uwtable "unsafe-fp-math"="false" "use-soft-float"="false" }
