[PATCH] use __builtin_convertvector for _mm_cvtepi32_ps to enable constant propagation
Matthias Kretz
kretz at kde.org
Tue Oct 1 00:36:55 PDT 2013
Testcase:
__m128 cvt0() { return _mm_cvtepi32_ps(_mm_set1_epi32(2)); }
__m128 cvt0(__m128i x) { return _mm_cvtepi32_ps(x); }
compiles to IR:
define <4 x float> @_Z4cvt0v() #0 {
entry:
ret <4 x float> <float 2.000000e+00, float 2.000000e+00, float 2.000000e+00, float 2.000000e+00>
}
define <4 x float> @_Z4cvt0Dv2_x(<2 x i64> %x) #0 {
entry:
%0 = bitcast <2 x i64> %x to <4 x i32>
%conv.i = sitofp <4 x i32> %0 to <4 x float>
ret <4 x float> %conv.i
}
and x86:
0000000000000000 <cvt0()>:
   0:  c5 f8 28 05 00 00 00 00   vmovaps 0x0(%rip),%xmm0    # 8 <cvt0()+0x8>
           4: R_X86_64_PC32  .LCPI0_0-0x4
   8:  c3                        retq
0000000000000010 <cvt0(long long __vector(2))>:
  10:  c5 f8 5b c0               vcvtdq2ps %xmm0,%xmm0
  14:  c3                        retq
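
For readers unfamiliar with the builtin, here is a minimal sketch of an element-wise int-to-float conversion written with __builtin_convertvector on generic vector types. The typedefs v4si and v4sf and both function names are hypothetical, chosen only for illustration; this is not the code from the patched header, and it assumes a compiler that provides the builtin (Clang, or GCC 9 and later).

```cpp
// Sketch only: v4si, v4sf, convert_i32_to_f32 and convert_constant are
// hypothetical names, not taken from the patch.
typedef int   v4si __attribute__((vector_size(16)));  // four 32-bit ints
typedef float v4sf __attribute__((vector_size(16)));  // four 32-bit floats

// Converts each int lane to the float of the same value.
inline v4sf convert_i32_to_f32(v4si x) {
    return __builtin_convertvector(x, v4sf);
}

// Constant input, mirroring the first function of the testcase above.
inline v4sf convert_constant() {
    v4si twos = {2, 2, 2, 2};
    return convert_i32_to_f32(twos);
}
```

Because the conversion is expressed as an ordinary vector operation rather than a target-specific builtin call, the optimizer can treat it like any other conversion, which is what allows the constant case in the testcase to fold into a constant vector.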
http://llvm-reviews.chandlerc.com/D1792