[PATCH] Lower certain build_vectors to insertps instructions
Elena Demikhovsky
elena.demikhovsky at intel.com
Sun Apr 27 01:03:45 PDT 2014
You see the right code because it is a small test and %xmm0 is used.
Try running this test:
define <4 x float> @test(<4 x float> %x) {
  %vecext = extractelement <4 x float> %x, i32 0
  %vecinit = insertelement <4 x float> undef, float %vecext, i32 0
  %vecext1 = extractelement <4 x float> %x, i32 1
  %vecinit2 = insertelement <4 x float> %vecinit, float %vecext1, i32 1
  %vecext3 = extractelement <4 x float> %x, i32 2
  %vecinit4 = insertelement <4 x float> %vecinit2, float %vecext3, i32 2
  %vecinit5 = insertelement <4 x float> %vecinit4, float 0.0, i32 3
  %mask = fcmp olt <4 x float> %vecinit5, %x
  %res = select <4 x i1> %mask, <4 x float> %x, <4 x float> %vecinit5
  ret <4 x float> %res
}
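
For checking the generated code against the expected result, the test above can be modeled in scalar C++. This is only a sketch of the IR semantics as I read them, not part of the patch; the names Vec4 and test_model are hypothetical.

```cpp
#include <array>

// Hypothetical 4-lane float vector standing in for <4 x float>.
using Vec4 = std::array<float, 4>;

// Scalar model of @test (illustrative only, not from the patch).
Vec4 test_model(const Vec4 &x) {
    // The build_vector: lanes 0..2 come from x, lane 3 is 0.0.
    Vec4 v = {x[0], x[1], x[2], 0.0f};
    Vec4 res{};
    for (int i = 0; i < 4; ++i) {
        // fcmp olt: ordered less-than, false if either operand is NaN.
        bool mask = v[i] < x[i];
        // select: take x where the mask is set, otherwise the built vector.
        res[i] = mask ? x[i] : v[i];
    }
    return res;
}
```

Lanes 0..2 always compare equal, so they pass through unchanged. Lane 3 yields x[3] when x[3] is greater than 0.0 and 0.0 otherwise, so a wrong lane 3 in the lowered vector shows up in the result.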
- Elena
http://reviews.llvm.org/D3521