[PATCH] Lower certain build_vectors to insertps instructions

Elena Demikhovsky elena.demikhovsky at intel.com
Sun Apr 27 01:03:45 PDT 2014


You see the right code because the test is small and %xmm0 is used.
Try running this test:

define <4 x float> @test(<4 x float> %x) {
  %vecext = extractelement <4 x float> %x, i32 0
  %vecinit = insertelement <4 x float> undef, float %vecext, i32 0
  %vecext1 = extractelement <4 x float> %x, i32 1
  %vecinit2 = insertelement <4 x float> %vecinit, float %vecext1, i32 1
  %vecext3 = extractelement <4 x float> %x, i32 2
  %vecinit4 = insertelement <4 x float> %vecinit2, float %vecext3, i32 2
  %vecinit5 = insertelement <4 x float> %vecinit4, float 0.0, i32 3
  %mask = fcmp olt <4 x float> %vecinit5, %x
  %res = select <4 x i1> %mask, <4 x float> %x, <4 x float> %vecinit5
  ret <4 x float> %res
}
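
For reference, here is a rough scalar model in C of what the IR above computes per lane. It is only a sketch: the `vec4f` type and the `model_test` name are hypothetical and not part of the patch. It shows that %x is still used by the fcmp and the select after the vector has been built.

```c
#include <stddef.h>

/* Hypothetical scalar model of @test; not part of the patch. */
typedef struct { float lane[4]; } vec4f;

static vec4f model_test(vec4f x) {
    /* %vecinit5: lanes 0..2 are copied from x, lane 3 is constant 0.0 */
    vec4f init = x;
    init.lane[3] = 0.0f;

    vec4f res;
    for (size_t i = 0; i < 4; ++i) {
        /* fcmp olt: ordered less-than, false if either operand is NaN */
        int mask = init.lane[i] < x.lane[i];
        /* select: take x where the mask is set, otherwise the built vector */
        res.lane[i] = mask ? x.lane[i] : init.lane[i];
    }
    return res;
}
```

In this model lanes 0..2 come back unchanged, and lane 3 is x[3] when x[3] is greater than 0.0 and 0.0 otherwise.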

-  Elena

http://reviews.llvm.org/D3521





