[llvm-commits] [llvm] r138768 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/uint_to_fp-2.ll
Eli Friedman
eli.friedman at gmail.com
Mon Aug 29 14:15:46 PDT 2011
Author: efriedma
Date: Mon Aug 29 16:15:46 2011
New Revision: 138768
URL: http://llvm.org/viewvc/llvm-project?rev=138768&view=rev
Log:
Explicitly zero out parts of a vector which are required to be zero by the algorithm in LowerUINT_TO_FP_i32. This only has a substantial effect on the generated code when the input is extracted from a vector register; other ways of loading an i32 do the appropriate zeroing implicitly. Fixes PR10802.
Modified:
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
llvm/trunk/test/CodeGen/X86/uint_to_fp-2.ll
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=138768&r1=138767&r2=138768&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Aug 29 16:15:46 2011
@@ -7713,6 +7713,9 @@
SDValue Load = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, MVT::v4i32,
Op.getOperand(0));
+ // Zero out the upper parts of the register.
+ Load = getShuffleVectorZeroOrUndef(Load, 0, true, Subtarget->hasSSE2(), DAG);
+
Load = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, MVT::f64,
DAG.getNode(ISD::BITCAST, dl, MVT::v2f64, Load),
DAG.getIntPtrConstant(0));
Modified: llvm/trunk/test/CodeGen/X86/uint_to_fp-2.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/uint_to_fp-2.ll?rev=138768&r1=138767&r2=138768&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/uint_to_fp-2.ll (original)
+++ llvm/trunk/test/CodeGen/X86/uint_to_fp-2.ll Mon Aug 29 16:15:46 2011
@@ -1,8 +1,33 @@
-; RUN: llc < %s -march=x86 -mattr=+sse2 | grep movsd | count 1
-; rdar://6504833
+; RUN: llc < %s -march=x86 -mattr=+sse2 | FileCheck %s
-define float @f(i32 %x) nounwind readnone {
+; rdar://6504833
+define float @test1(i32 %x) nounwind readnone {
+; CHECK: test1
+; CHECK: movd
+; CHECK: orpd
+; CHECK: subsd
+; CHECK: cvtsd2ss
+; CHECK: movss
+; CHECK: flds
+; CHECK: ret
entry:
%0 = uitofp i32 %x to float
ret float %0
}
+
+; PR10802
+define float @test2(<4 x i32> %x) nounwind readnone ssp {
+; CHECK: test2
+; CHECK: xorps [[ZERO:%xmm[0-9]+]]
+; CHECK: movss {{.*}}, [[ZERO]]
+; CHECK: orps
+; CHECK: subsd
+; CHECK: cvtsd2ss
+; CHECK: movss
+; CHECK: flds
+; CHECK: ret
+entry:
+ %vecext = extractelement <4 x i32> %x, i32 0
+ %conv = uitofp i32 %vecext to float
+ ret float %conv
+}
More information about the llvm-commits
mailing list