[llvm] e445447 - [X86] When handling i64->f32 sint_to_fp on 32-bit targets only bitcast to f64 if sse2 is enabled.

Wed Jan 15 18:27:13 PST 2020

Author: Craig Topper
Date: 2020-01-15T18:26:28-08:00
New Revision: e4454479212b28532909e0a0782b0102e9bcd1c4

URL: https://github.com/llvm/llvm-project/commit/e4454479212b28532909e0a0782b0102e9bcd1c4
DIFF: https://github.com/llvm/llvm-project/commit/e4454479212b28532909e0a0782b0102e9bcd1c4.diff

LOG: [X86] When handling i64->f32 sint_to_fp on 32-bit targets only bitcast to f64 if sse2 is enabled.

The code is trying to copy the i64 value to an xmm register to
use a 64-bit store so that the 64-bit fild can benefit from
store forwarding.

But this trick only works if f64 is going to be stored in an
XMM register. If we only have SSE1, then only float lives in XMM
registers. So this trick just causes two i32 stores, an f64
load into the x87, an f64 store from the x87, and a 64-bit fild. We
end up with an extra stack temporary and still don't get store forwarding.
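
As a rough illustration of why the old guard misfired, here is a small
standalone model of the two conditions. It is a sketch, not the code in
X86ISelLowering.cpp: SubtargetModel, FPType, isInSSEReg, oldGuard and
newGuard are hypothetical names, and it assumes UseSSEReg reflects whether
the result type (f32 in this case) is held in an XMM register.

```cpp
// Illustrative model only; all names here are hypothetical.
struct SubtargetModel {
  bool HasSSE1;
  bool HasSSE2;
  bool Is64Bit;
};

enum class FPType { F32, F64 };

// Assumed rule for "this scalar FP type lives in an XMM register":
// f32 needs SSE1, f64 needs SSE2.
constexpr bool isInSSEReg(const SubtargetModel &ST, FPType VT) {
  return VT == FPType::F32 ? ST.HasSSE1 : ST.HasSSE2;
}

// Old guard: keyed off the result type being in an XMM register.
constexpr bool oldGuard(const SubtargetModel &ST, bool SrcIsI64,
                        FPType ResultVT) {
  return SrcIsI64 && isInSSEReg(ST, ResultVT) && !ST.Is64Bit;
}

// New guard: the f64 bitcast only pays off when f64 itself is in XMM,
// which requires SSE2 regardless of the result type.
constexpr bool newGuard(const SubtargetModel &ST, bool SrcIsI64) {
  return SrcIsI64 && ST.HasSSE2 && !ST.Is64Bit;
}

constexpr SubtargetModel SSE1Only32{true, false, false};
constexpr SubtargetModel SSE2On32{true, true, false};
constexpr SubtargetModel SSE2On64{true, true, true};

// SSE1-only, i64->f32: the old guard fires even though f64 is on the x87.
static_assert(oldGuard(SSE1Only32, true, FPType::F32), "old guard fires");
static_assert(!newGuard(SSE1Only32, true), "new guard skips the bitcast");
// With SSE2 on a 32-bit target the trick still applies.
static_assert(newGuard(SSE2On32, true), "bitcast kept with SSE2");
// 64-bit targets never take this path in either version.
static_assert(!newGuard(SSE2On64, true), "not used on 64-bit");
```

Under these assumptions the two guards only disagree for an f32 result on
an SSE1-only 32-bit target, which is the SSE1_32 configuration whose test
output changes.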

We might be able to use v2f32 here instead, but I didn't check. I
just wanted the code to make sense.

Found by inspection as I continue to stare too hard at our
int_to_fp conversions.

Added: 
    

Modified: 
    llvm/lib/Target/X86/X86ISelLowering.cpp
    llvm/test/CodeGen/X86/scalar-int-to-fp.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 0f152968ddfd..72fd9f70a1ed 100644

--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -18832,7 +18832,7 @@ SDValue X86TargetLowering::LowerSINT_TO_FP(SDValue Op,
     return LowerF128Call(Op, DAG, RTLIB::getSINTTOFP(SrcVT, VT));
 
   SDValue ValueToStore = Src;
-  if (SrcVT == MVT::i64 && UseSSEReg && !Subtarget.is64Bit())
+  if (SrcVT == MVT::i64 && Subtarget.hasSSE2() && !Subtarget.is64Bit())
     // Bitcasting to f64 here allows us to do a single 64-bit store from
     // an SSE register, avoiding the store forwarding penalty that would come
     // with two 32-bit stores.

diff  --git a/llvm/test/CodeGen/X86/scalar-int-to-fp.ll b/llvm/test/CodeGen/X86/scalar-int-to-fp.ll
index 03437f017dc8..67545a36168d 100644
--- a/llvm/test/CodeGen/X86/scalar-int-to-fp.ll
+++ b/llvm/test/CodeGen/X86/scalar-int-to-fp.ll
@@ -576,15 +576,13 @@ define float @s64_to_f_2(i64 %a) nounwind {
 ; SSE1_32-NEXT:    pushl %ebp
 ; SSE1_32-NEXT:    movl %esp, %ebp
 ; SSE1_32-NEXT:    andl $-8, %esp
-; SSE1_32-NEXT:    subl $24, %esp
+; SSE1_32-NEXT:    subl $16, %esp
 ; SSE1_32-NEXT:    movl 8(%ebp), %eax
 ; SSE1_32-NEXT:    movl 12(%ebp), %ecx
 ; SSE1_32-NEXT:    addl $5, %eax
-; SSE1_32-NEXT:    movl %eax, {{[0-9]+}}(%esp)
 ; SSE1_32-NEXT:    adcl $0, %ecx
+; SSE1_32-NEXT:    movl %eax, {{[0-9]+}}(%esp)
 ; SSE1_32-NEXT:    movl %ecx, {{[0-9]+}}(%esp)
-; SSE1_32-NEXT:    fldl {{[0-9]+}}(%esp)
-; SSE1_32-NEXT:    fstpl {{[0-9]+}}(%esp)
 ; SSE1_32-NEXT:    fildll {{[0-9]+}}(%esp)
 ; SSE1_32-NEXT:    fstps {{[0-9]+}}(%esp)
 ; SSE1_32-NEXT:    flds {{[0-9]+}}(%esp)