<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">So that patch was mainly about matching the behaviour of _mm_cvttsd_si32 (cvttsd2si), which already did the scalar equivalent.<div class=""><br class=""></div><div class="">It turns out there is a problem if an out-of-range value gets constant folded: LangRef says the result is undefined (in practice the fold sets the result to zero), but the cvttsd2si/cvttps2dq/cvttpd2dq instructions guarantee that the result is 0x80000000.</div><div class=""><br class=""></div><div class="">I’ll prepare a patch that reverts these changes (including the scalar case) and performs a fast-math-only combine instead.</div><div class=""><br class=""></div><div class="">Thanks, Simon.<br class=""><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On 6 Jul 2016, at 08:16, Eli Friedman <<a href="mailto:eli.friedman@gmail.com" class="">eli.friedman@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="">Sorry this review comment is a little late... but doesn't LangRef say "If the
value won’t fit in the integer type, the results are undefined."?<br class=""><br class=""></div>-Eli<br class=""><div class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">On Thu, Jun 2, 2016 at 3:55 AM, Simon Pilgrim via llvm-commits <span dir="ltr" class=""><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank" class="">llvm-commits@lists.llvm.org</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: rksimon<br class="">
Date: Thu Jun 2 05:55:21 2016<br class="">
New Revision: 271510<br class="">
<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=271510&view=rev" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project?rev=271510&view=rev</a><br class="">
Log:<br class="">
[X86][SSE] Replace (V)CVTTPS2DQ and VCVTTPD2DQ truncating (round to zero) f32/f64 to i32 with generic IR (llvm)<br class="">
<br class="">
This patch removes the llvm intrinsics (V)CVTTPS2DQ and VCVTTPD2DQ truncation (round to zero) conversions and auto-upgrades to FP_TO_SINT calls instead.<br class="">
<br class="">
Note: I looked at updating CVTTPD2DQ as well but this still requires a lot more work to correctly lower.<br class="">
<br class="">
Differential Revision: <a href="http://reviews.llvm.org/D20860" rel="noreferrer" target="_blank" class="">http://reviews.llvm.org/D20860</a><br class="">
<br class="">
Modified:<br class="">
llvm/trunk/include/llvm/IR/IntrinsicsX86.td<br class="">
llvm/trunk/lib/IR/AutoUpgrade.cpp<br class="">
llvm/trunk/lib/Target/X86/X86InstrSSE.td<br class="">
llvm/trunk/test/CodeGen/X86/avx-intrinsics-fast-isel.ll<br class="">
llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86-upgrade.ll<br class="">
llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll<br class="">
llvm/trunk/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll<br class="">
llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86-upgrade.ll<br class="">
llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86.ll<br class="">
<br class="">
Modified: llvm/trunk/include/llvm/IR/IntrinsicsX86.td<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/IntrinsicsX86.td?rev=271510&r1=271509&r2=271510&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/IntrinsicsX86.td?rev=271510&r1=271509&r2=271510&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/include/llvm/IR/IntrinsicsX86.td (original)<br class="">
+++ llvm/trunk/include/llvm/IR/IntrinsicsX86.td Thu Jun 2 05:55:21 2016<br class="">
@@ -488,8 +488,6 @@ let TargetPrefix = "x86" in { // All in<br class="">
Intrinsic<[llvm_v4f32_ty], [llvm_v2f64_ty], [IntrNoMem]>;<br class="">
def int_x86_sse2_cvtps2dq : GCCBuiltin<"__builtin_ia32_cvtps2dq">,<br class="">
Intrinsic<[llvm_v4i32_ty], [llvm_v4f32_ty], [IntrNoMem]>;<br class="">
- def int_x86_sse2_cvttps2dq : GCCBuiltin<"__builtin_ia32_cvttps2dq">,<br class="">
- Intrinsic<[llvm_v4i32_ty], [llvm_v4f32_ty], [IntrNoMem]>;<br class="">
def int_x86_sse2_cvtsd2si : GCCBuiltin<"__builtin_ia32_cvtsd2si">,<br class="">
Intrinsic<[llvm_i32_ty], [llvm_v2f64_ty], [IntrNoMem]>;<br class="">
def int_x86_sse2_cvtsd2si64 : GCCBuiltin<"__builtin_ia32_cvtsd2si64">,<br class="">
@@ -1725,12 +1723,8 @@ let TargetPrefix = "x86" in { // All in<br class="">
Intrinsic<[llvm_v4f32_ty], [llvm_v4f64_ty], [IntrNoMem]>;<br class="">
def int_x86_avx_cvt_ps2dq_256 : GCCBuiltin<"__builtin_ia32_cvtps2dq256">,<br class="">
Intrinsic<[llvm_v8i32_ty], [llvm_v8f32_ty], [IntrNoMem]>;<br class="">
- def int_x86_avx_cvtt_pd2dq_256 : GCCBuiltin<"__builtin_ia32_cvttpd2dq256">,<br class="">
- Intrinsic<[llvm_v4i32_ty], [llvm_v4f64_ty], [IntrNoMem]>;<br class="">
def int_x86_avx_cvt_pd2dq_256 : GCCBuiltin<"__builtin_ia32_cvtpd2dq256">,<br class="">
Intrinsic<[llvm_v4i32_ty], [llvm_v4f64_ty], [IntrNoMem]>;<br class="">
- def int_x86_avx_cvtt_ps2dq_256 : GCCBuiltin<"__builtin_ia32_cvttps2dq256">,<br class="">
- Intrinsic<[llvm_v8i32_ty], [llvm_v8f32_ty], [IntrNoMem]>;<br class="">
}<br class="">
<br class="">
// Vector bit test<br class="">
<br class="">
Modified: llvm/trunk/lib/IR/AutoUpgrade.cpp<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/AutoUpgrade.cpp?rev=271510&r1=271509&r2=271510&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/AutoUpgrade.cpp?rev=271510&r1=271509&r2=271510&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/lib/IR/AutoUpgrade.cpp (original)<br class="">
+++ llvm/trunk/lib/IR/AutoUpgrade.cpp Thu Jun 2 05:55:21 2016<br class="">
@@ -185,6 +185,8 @@ static bool UpgradeIntrinsicFunction1(Fu<br class="">
Name == "x86.sse2.cvtps2pd" ||<br class="">
Name == "x86.avx.cvtdq2.pd.256" ||<br class="">
Name == "x86.avx.cvt.ps2.pd.256" ||<br class="">
+ Name == "x86.sse2.cvttps2dq" ||<br class="">
+ Name.startswith("x86.avx.cvtt.") ||<br class="">
Name.startswith("x86.avx.vinsertf128.") ||<br class="">
Name == "x86.avx2.vinserti128" ||<br class="">
Name.startswith("x86.avx.vextractf128.") ||<br class="">
@@ -498,6 +500,12 @@ void llvm::UpgradeIntrinsicCall(CallInst<br class="">
Rep = Builder.CreateSIToFP(Rep, DstTy, "cvtdq2pd");<br class="">
else<br class="">
Rep = Builder.CreateFPExt(Rep, DstTy, "cvtps2pd");<br class="">
+ } else if (Name == "llvm.x86.sse2.cvttps2dq" ||<br class="">
+ Name.startswith("llvm.x86.avx.cvtt.")) {<br class="">
+ // Truncation (round to zero) float/double to i32 vector conversion.<br class="">
+ Value *Src = CI->getArgOperand(0);<br class="">
+ VectorType *DstTy = cast<VectorType>(CI->getType());<br class="">
+ Rep = Builder.CreateFPToSI(Src, DstTy, "cvtt");<br class="">
} else if (Name.startswith("llvm.x86.avx.movnt.")) {<br class="">
Module *M = F->getParent();<br class="">
SmallVector<Metadata *, 1> Elts;<br class="">
<br class="">
Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=271510&r1=271509&r2=271510&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=271510&r1=271509&r2=271510&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original)<br class="">
+++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Thu Jun 2 05:55:21 2016<br class="">
@@ -2013,35 +2013,24 @@ def CVTPD2DQrr : SDI<0xE6, MRMSrcReg, (<br class="">
// SSE2 packed instructions with XS prefix<br class="">
def VCVTTPS2DQrr : VS2SI<0x5B, MRMSrcReg, (outs VR128:$dst), (ins VR128:$src),<br class="">
"cvttps2dq\t{$src, $dst|$dst, $src}",<br class="">
- [(set VR128:$dst,<br class="">
- (int_x86_sse2_cvttps2dq VR128:$src))],<br class="">
- IIC_SSE_CVT_PS_RR>, VEX, Sched<[WriteCvtF2I]>;<br class="">
+ [], IIC_SSE_CVT_PS_RR>, VEX, Sched<[WriteCvtF2I]>;<br class="">
def VCVTTPS2DQrm : VS2SI<0x5B, MRMSrcMem, (outs VR128:$dst), (ins f128mem:$src),<br class="">
"cvttps2dq\t{$src, $dst|$dst, $src}",<br class="">
- [(set VR128:$dst, (int_x86_sse2_cvttps2dq<br class="">
- (loadv4f32 addr:$src)))],<br class="">
- IIC_SSE_CVT_PS_RM>, VEX, Sched<[WriteCvtF2ILd]>;<br class="">
+ [], IIC_SSE_CVT_PS_RM>, VEX, Sched<[WriteCvtF2ILd]>;<br class="">
def VCVTTPS2DQYrr : VS2SI<0x5B, MRMSrcReg, (outs VR256:$dst), (ins VR256:$src),<br class="">
"cvttps2dq\t{$src, $dst|$dst, $src}",<br class="">
- [(set VR256:$dst,<br class="">
- (int_x86_avx_cvtt_ps2dq_256 VR256:$src))],<br class="">
- IIC_SSE_CVT_PS_RR>, VEX, VEX_L, Sched<[WriteCvtF2I]>;<br class="">
+ [], IIC_SSE_CVT_PS_RR>, VEX, VEX_L, Sched<[WriteCvtF2I]>;<br class="">
def VCVTTPS2DQYrm : VS2SI<0x5B, MRMSrcMem, (outs VR256:$dst), (ins f256mem:$src),<br class="">
"cvttps2dq\t{$src, $dst|$dst, $src}",<br class="">
- [(set VR256:$dst, (int_x86_avx_cvtt_ps2dq_256<br class="">
- (loadv8f32 addr:$src)))],<br class="">
- IIC_SSE_CVT_PS_RM>, VEX, VEX_L,<br class="">
+ [], IIC_SSE_CVT_PS_RM>, VEX, VEX_L,<br class="">
Sched<[WriteCvtF2ILd]>;<br class="">
<br class="">
def CVTTPS2DQrr : S2SI<0x5B, MRMSrcReg, (outs VR128:$dst), (ins VR128:$src),<br class="">
"cvttps2dq\t{$src, $dst|$dst, $src}",<br class="">
- [(set VR128:$dst, (int_x86_sse2_cvttps2dq VR128:$src))],<br class="">
- IIC_SSE_CVT_PS_RR>, Sched<[WriteCvtF2I]>;<br class="">
+ [], IIC_SSE_CVT_PS_RR>, Sched<[WriteCvtF2I]>;<br class="">
def CVTTPS2DQrm : S2SI<0x5B, MRMSrcMem, (outs VR128:$dst), (ins f128mem:$src),<br class="">
"cvttps2dq\t{$src, $dst|$dst, $src}",<br class="">
- [(set VR128:$dst,<br class="">
- (int_x86_sse2_cvttps2dq (memopv4f32 addr:$src)))],<br class="">
- IIC_SSE_CVT_PS_RM>, Sched<[WriteCvtF2ILd]>;<br class="">
+ [], IIC_SSE_CVT_PS_RM>, Sched<[WriteCvtF2ILd]>;<br class="">
<br class="">
let Predicates = [HasAVX] in {<br class="">
def : Pat<(int_x86_sse2_cvtdq2ps VR128:$src),<br class="">
@@ -2111,14 +2100,10 @@ def VCVTTPD2DQXrm : VPDI<0xE6, MRMSrcMem<br class="">
// YMM only<br class="">
def VCVTTPD2DQYrr : VPDI<0xE6, MRMSrcReg, (outs VR128:$dst), (ins VR256:$src),<br class="">
"cvttpd2dq{y}\t{$src, $dst|$dst, $src}",<br class="">
- [(set VR128:$dst,<br class="">
- (int_x86_avx_cvtt_pd2dq_256 VR256:$src))],<br class="">
- IIC_SSE_CVT_PD_RR>, VEX, VEX_L, Sched<[WriteCvtF2I]>;<br class="">
+ [], IIC_SSE_CVT_PD_RR>, VEX, VEX_L, Sched<[WriteCvtF2I]>;<br class="">
def VCVTTPD2DQYrm : VPDI<0xE6, MRMSrcMem, (outs VR128:$dst), (ins f256mem:$src),<br class="">
"cvttpd2dq{y}\t{$src, $dst|$dst, $src}",<br class="">
- [(set VR128:$dst,<br class="">
- (int_x86_avx_cvtt_pd2dq_256 (loadv4f64 addr:$src)))],<br class="">
- IIC_SSE_CVT_PD_RM>, VEX, VEX_L, Sched<[WriteCvtF2ILd]>;<br class="">
+ [], IIC_SSE_CVT_PD_RM>, VEX, VEX_L, Sched<[WriteCvtF2ILd]>;<br class="">
def : InstAlias<"vcvttpd2dq\t{$src, $dst|$dst, $src}",<br class="">
(VCVTTPD2DQYrr VR128:$dst, VR256:$src), 0>;<br class="">
<br class="">
<br class="">
Modified: llvm/trunk/test/CodeGen/X86/avx-intrinsics-fast-isel.ll<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-fast-isel.ll?rev=271510&r1=271509&r2=271510&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-fast-isel.ll?rev=271510&r1=271509&r2=271510&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/test/CodeGen/X86/avx-intrinsics-fast-isel.ll (original)<br class="">
+++ llvm/trunk/test/CodeGen/X86/avx-intrinsics-fast-isel.ll Thu Jun 2 05:55:21 2016<br class="">
@@ -675,11 +675,10 @@ define <2 x i64> @test_mm256_cvttpd_epi3<br class="">
; X64-NEXT: vcvttpd2dqy %ymm0, %xmm0<br class="">
; X64-NEXT: vzeroupper<br class="">
; X64-NEXT: retq<br class="">
- %cvt = call <4 x i32> @llvm.x86.avx.cvtt.pd2dq.256(<4 x double> %a0)<br class="">
+ %cvt = fptosi <4 x double> %a0 to <4 x i32><br class="">
%res = bitcast <4 x i32> %cvt to <2 x i64><br class="">
ret <2 x i64> %res<br class="">
}<br class="">
-declare <4 x i32> @llvm.x86.avx.cvtt.pd2dq.256(<4 x double>) nounwind readnone<br class="">
<br class="">
define <4 x i64> @test_mm256_cvttps_epi32(<8 x float> %a0) nounwind {<br class="">
; X32-LABEL: test_mm256_cvttps_epi32:<br class="">
@@ -691,11 +690,10 @@ define <4 x i64> @test_mm256_cvttps_epi3<br class="">
; X64: # BB#0:<br class="">
; X64-NEXT: vcvttps2dq %ymm0, %ymm0<br class="">
; X64-NEXT: retq<br class="">
- %cvt = call <8 x i32> @llvm.x86.avx.cvtt.ps2dq.256(<8 x float> %a0)<br class="">
+ %cvt = fptosi <8 x float> %a0 to <8 x i32><br class="">
%res = bitcast <8 x i32> %cvt to <4 x i64><br class="">
ret <4 x i64> %res<br class="">
}<br class="">
-declare <8 x i32> @llvm.x86.avx.cvtt.ps2dq.256(<8 x float>) nounwind readnone<br class="">
<br class="">
define <4 x double> @test_mm256_div_pd(<4 x double> %a0, <4 x double> %a1) nounwind {<br class="">
; X32-LABEL: test_mm256_div_pd:<br class="">
<br class="">
Modified: llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86-upgrade.ll<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86-upgrade.ll?rev=271510&r1=271509&r2=271510&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86-upgrade.ll?rev=271510&r1=271509&r2=271510&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86-upgrade.ll (original)<br class="">
+++ llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86-upgrade.ll Thu Jun 2 05:55:21 2016<br class="">
@@ -357,12 +357,35 @@ define <4 x double> @test_x86_avx_cvt_ps<br class="">
declare <4 x double> @llvm.x86.avx.cvt.ps2.pd.256(<4 x float>) nounwind readnone<br class="">
<br class="">
<br class="">
+define <4 x i32> @test_x86_avx_cvtt_pd2dq_256(<4 x double> %a0) {<br class="">
+; CHECK-LABEL: test_x86_avx_cvtt_pd2dq_256:<br class="">
+; CHECK: ## BB#0:<br class="">
+; CHECK-NEXT: vcvttpd2dqy %ymm0, %xmm0<br class="">
+; CHECK-NEXT: vzeroupper<br class="">
+; CHECK-NEXT: retl<br class="">
+ %res = call <4 x i32> @llvm.x86.avx.cvtt.pd2dq.256(<4 x double> %a0) ; <<4 x i32>> [#uses=1]<br class="">
+ ret <4 x i32> %res<br class="">
+}<br class="">
+declare <4 x i32> @llvm.x86.avx.cvtt.pd2dq.256(<4 x double>) nounwind readnone<br class="">
+<br class="">
+<br class="">
+define <8 x i32> @test_x86_avx_cvtt_ps2dq_256(<8 x float> %a0) {<br class="">
+; CHECK-LABEL: test_x86_avx_cvtt_ps2dq_256:<br class="">
+; CHECK: ## BB#0:<br class="">
+; CHECK-NEXT: vcvttps2dq %ymm0, %ymm0<br class="">
+; CHECK-NEXT: retl<br class="">
+ %res = call <8 x i32> @llvm.x86.avx.cvtt.ps2dq.256(<8 x float> %a0) ; <<8 x i32>> [#uses=1]<br class="">
+ ret <8 x i32> %res<br class="">
+}<br class="">
+declare <8 x i32> @llvm.x86.avx.cvtt.ps2dq.256(<8 x float>) nounwind readnone<br class="">
+<br class="">
+<br class="">
define void @test_x86_sse2_storeu_dq(i8* %a0, <16 x i8> %a1) {<br class="">
; add operation forces the execution domain.<br class="">
; CHECK-LABEL: test_x86_sse2_storeu_dq:<br class="">
; CHECK: ## BB#0:<br class="">
; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax<br class="">
-; CHECK-NEXT: vpaddb LCPI32_0, %xmm0, %xmm0<br class="">
+; CHECK-NEXT: vpaddb LCPI34_0, %xmm0, %xmm0<br class="">
; CHECK-NEXT: vmovdqu %xmm0, (%eax)<br class="">
; CHECK-NEXT: retl<br class="">
%a2 = add <16 x i8> %a1, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1><br class="">
<br class="">
Modified: llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll?rev=271510&r1=271509&r2=271510&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll?rev=271510&r1=271509&r2=271510&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll (original)<br class="">
+++ llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll Thu Jun 2 05:55:21 2016<br class="">
@@ -3407,39 +3407,6 @@ define <8 x float> @test_x86_avx_cvtdq2_<br class="">
declare <8 x float> @llvm.x86.avx.cvtdq2.ps.256(<8 x i32>) nounwind readnone<br class="">
<br class="">
<br class="">
-define <4 x i32> @test_x86_avx_cvtt_pd2dq_256(<4 x double> %a0) {<br class="">
-; AVX-LABEL: test_x86_avx_cvtt_pd2dq_256:<br class="">
-; AVX: ## BB#0:<br class="">
-; AVX-NEXT: vcvttpd2dqy %ymm0, %xmm0<br class="">
-; AVX-NEXT: vzeroupper<br class="">
-; AVX-NEXT: retl<br class="">
-;<br class="">
-; AVX512VL-LABEL: test_x86_avx_cvtt_pd2dq_256:<br class="">
-; AVX512VL: ## BB#0:<br class="">
-; AVX512VL-NEXT: vcvttpd2dqy %ymm0, %xmm0<br class="">
-; AVX512VL-NEXT: retl<br class="">
- %res = call <4 x i32> @llvm.x86.avx.cvtt.pd2dq.256(<4 x double> %a0) ; <<4 x i32>> [#uses=1]<br class="">
- ret <4 x i32> %res<br class="">
-}<br class="">
-declare <4 x i32> @llvm.x86.avx.cvtt.pd2dq.256(<4 x double>) nounwind readnone<br class="">
-<br class="">
-<br class="">
-define <8 x i32> @test_x86_avx_cvtt_ps2dq_256(<8 x float> %a0) {<br class="">
-; AVX-LABEL: test_x86_avx_cvtt_ps2dq_256:<br class="">
-; AVX: ## BB#0:<br class="">
-; AVX-NEXT: vcvttps2dq %ymm0, %ymm0<br class="">
-; AVX-NEXT: retl<br class="">
-;<br class="">
-; AVX512VL-LABEL: test_x86_avx_cvtt_ps2dq_256:<br class="">
-; AVX512VL: ## BB#0:<br class="">
-; AVX512VL-NEXT: vcvttps2dq %ymm0, %ymm0<br class="">
-; AVX512VL-NEXT: retl<br class="">
- %res = call <8 x i32> @llvm.x86.avx.cvtt.ps2dq.256(<8 x float> %a0) ; <<8 x i32>> [#uses=1]<br class="">
- ret <8 x i32> %res<br class="">
-}<br class="">
-declare <8 x i32> @llvm.x86.avx.cvtt.ps2dq.256(<8 x float>) nounwind readnone<br class="">
-<br class="">
-<br class="">
define <8 x float> @test_x86_avx_dp_ps_256(<8 x float> %a0, <8 x float> %a1) {<br class="">
; AVX-LABEL: test_x86_avx_dp_ps_256:<br class="">
; AVX: ## BB#0:<br class="">
@@ -4133,7 +4100,7 @@ define <4 x double> @test_x86_avx_vpermi<br class="">
;<br class="">
; AVX512VL-LABEL: test_x86_avx_vpermilvar_pd_256_2:<br class="">
; AVX512VL: ## BB#0:<br class="">
-; AVX512VL-NEXT: vpermilpd LCPI233_0, %ymm0, %ymm0<br class="">
+; AVX512VL-NEXT: vpermilpd LCPI231_0, %ymm0, %ymm0<br class="">
; AVX512VL-NEXT: retl<br class="">
%res = call <4 x double> @llvm.x86.avx.vpermilvar.pd.256(<4 x double> %a0, <4 x i64> <i64 2, i64 0, i64 0, i64 2>) ; <<4 x double>> [#uses=1]<br class="">
ret <4 x double> %res<br class="">
@@ -4625,7 +4592,7 @@ define void @movnt_dq(i8* %p, <2 x i64><br class="">
; AVX-LABEL: movnt_dq:<br class="">
; AVX: ## BB#0:<br class="">
; AVX-NEXT: movl {{[0-9]+}}(%esp), %eax<br class="">
-; AVX-NEXT: vpaddq LCPI260_0, %xmm0, %xmm0<br class="">
+; AVX-NEXT: vpaddq LCPI258_0, %xmm0, %xmm0<br class="">
; AVX-NEXT: vmovntdq %ymm0, (%eax)<br class="">
; AVX-NEXT: vzeroupper<br class="">
; AVX-NEXT: retl<br class="">
@@ -4633,7 +4600,7 @@ define void @movnt_dq(i8* %p, <2 x i64><br class="">
; AVX512VL-LABEL: movnt_dq:<br class="">
; AVX512VL: ## BB#0:<br class="">
; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax<br class="">
-; AVX512VL-NEXT: vpaddq LCPI260_0, %xmm0, %xmm0<br class="">
+; AVX512VL-NEXT: vpaddq LCPI258_0, %xmm0, %xmm0<br class="">
; AVX512VL-NEXT: vmovntdq %ymm0, (%eax)<br class="">
; AVX512VL-NEXT: retl<br class="">
%a2 = add <2 x i64> %a1, <i64 1, i64 1><br class="">
<br class="">
Modified: llvm/trunk/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll?rev=271510&r1=271509&r2=271510&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll?rev=271510&r1=271509&r2=271510&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll (original)<br class="">
+++ llvm/trunk/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll Thu Jun 2 05:55:21 2016<br class="">
@@ -1280,11 +1280,10 @@ define <2 x i64> @test_mm_cvttps_epi32(<<br class="">
; X64: # BB#0:<br class="">
; X64-NEXT: cvttps2dq %xmm0, %xmm0<br class="">
; X64-NEXT: retq<br class="">
- %res = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %a0)<br class="">
+ %res = fptosi <4 x float> %a0 to <4 x i32><br class="">
%bc = bitcast <4 x i32> %res to <2 x i64><br class="">
ret <2 x i64> %bc<br class="">
}<br class="">
-declare <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float>) nounwind readnone<br class="">
<br class="">
define i32 @test_mm_cvttsd_si32(<2 x double> %a0) nounwind {<br class="">
; X32-LABEL: test_mm_cvttsd_si32:<br class="">
<br class="">
Modified: llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86-upgrade.ll<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86-upgrade.ll?rev=271510&r1=271509&r2=271510&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86-upgrade.ll?rev=271510&r1=271509&r2=271510&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86-upgrade.ll (original)<br class="">
+++ llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86-upgrade.ll Thu Jun 2 05:55:21 2016<br class="">
@@ -84,6 +84,17 @@ define <2 x double> @test_x86_sse2_cvtps<br class="">
declare <2 x double> @llvm.x86.sse2.cvtps2pd(<4 x float>) nounwind readnone<br class="">
<br class="">
<br class="">
+define <4 x i32> @test_x86_sse2_cvttps2dq(<4 x float> %a0) {<br class="">
+; CHECK-LABEL: test_x86_sse2_cvttps2dq:<br class="">
+; CHECK: ## BB#0:<br class="">
+; CHECK-NEXT: cvttps2dq %xmm0, %xmm0<br class="">
+; CHECK-NEXT: retl<br class="">
+ %res = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %a0) ; <<4 x i32>> [#uses=1]<br class="">
+ ret <4 x i32> %res<br class="">
+}<br class="">
+declare <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float>) nounwind readnone<br class="">
+<br class="">
+<br class="">
define void @test_x86_sse2_storel_dq(i8* %a0, <4 x i32> %a1) {<br class="">
; CHECK-LABEL: test_x86_sse2_storel_dq:<br class="">
; CHECK: ## BB#0:<br class="">
@@ -101,7 +112,7 @@ define void @test_x86_sse2_storeu_dq(i8*<br class="">
; CHECK-LABEL: test_x86_sse2_storeu_dq:<br class="">
; CHECK: ## BB#0:<br class="">
; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax<br class="">
-; CHECK-NEXT: paddb LCPI7_0, %xmm0<br class="">
+; CHECK-NEXT: paddb LCPI8_0, %xmm0<br class="">
; CHECK-NEXT: movdqu %xmm0, (%eax)<br class="">
; CHECK-NEXT: retl<br class="">
%a2 = add <16 x i8> %a1, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1><br class="">
<br class="">
Modified: llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86.ll<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86.ll?rev=271510&r1=271509&r2=271510&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86.ll?rev=271510&r1=271509&r2=271510&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86.ll (original)<br class="">
+++ llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86.ll Thu Jun 2 05:55:21 2016<br class="">
@@ -322,22 +322,6 @@ define <4 x i32> @test_x86_sse2_cvttpd2d<br class="">
declare <4 x i32> @llvm.x86.sse2.cvttpd2dq(<2 x double>) nounwind readnone<br class="">
<br class="">
<br class="">
-define <4 x i32> @test_x86_sse2_cvttps2dq(<4 x float> %a0) {<br class="">
-; SSE-LABEL: test_x86_sse2_cvttps2dq:<br class="">
-; SSE: ## BB#0:<br class="">
-; SSE-NEXT: cvttps2dq %xmm0, %xmm0<br class="">
-; SSE-NEXT: retl<br class="">
-;<br class="">
-; KNL-LABEL: test_x86_sse2_cvttps2dq:<br class="">
-; KNL: ## BB#0:<br class="">
-; KNL-NEXT: vcvttps2dq %xmm0, %xmm0<br class="">
-; KNL-NEXT: retl<br class="">
- %res = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %a0) ; <<4 x i32>> [#uses=1]<br class="">
- ret <4 x i32> %res<br class="">
-}<br class="">
-declare <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float>) nounwind readnone<br class="">
-<br class="">
-<br class="">
define i32 @test_x86_sse2_cvttsd2si(<2 x double> %a0) {<br class="">
; SSE-LABEL: test_x86_sse2_cvttsd2si:<br class="">
; SSE: ## BB#0:<br class="">
<br class="">
<br class="">
_______________________________________________<br class="">
llvm-commits mailing list<br class="">
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank" class="">llvm-commits@lists.llvm.org</a><br class="">
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br class="">
</blockquote></div><br class=""></div></div></div>
</div></blockquote></div><br class=""></div></div></body></html>