[LLVMdev] Poor code generation for odd sized vectors

Tue Sep 27 04:33:48 PDT 2011

Hi all,

I'm compiling LLVM IR code like this on x86-64:

  define linkonce ccc <16 x float> @vector_add_float(<16 x float>  %a.78, <16 x float>  %a.79) align 8  
  {
  entry:
    %result.80 = fadd <16 x float> %a.78, %a.79
    ret <16 x float> %result.80
  }

This works really well when the vector length (16 in the above) is
an integer multiple of the SSE vector register width (4), resulting
in the following assembler code:

    vector_add_float:                       # @vector_add_float
    .Leh_func_begin0:
    # BB#0:                                 # %entry
	addps	%xmm4, %xmm0
	addps	%xmm5, %xmm1
	addps	%xmm6, %xmm2
	addps	%xmm7, %xmm3
	ret

However, when the vector length is increased to, say, 18, the generated
code is rather poor; it is code that could easily be improved by hand.
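
For reference, the 18-element case is the same function with only the
vector type changed. A minimal sketch, assuming the function and value
names are kept exactly as in the 16-element version above:

```llvm
; Same function as above, with the vector length changed from 16 to 18.
; Names are carried over from the 16-element version for illustration.
define linkonce ccc <18 x float> @vector_add_float(<18 x float> %a.78, <18 x float> %a.79) align 8
{
entry:
  ; Element-wise add of two 18-element float vectors.
  %result.80 = fadd <18 x float> %a.78, %a.79
  ret <18 x float> %result.80
}
```

Since 18 = 4 * 4 + 2, the last two elements do not fill a whole SSE
register.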

Is this a known issue? Should LLVM be doing better? Should I raise a
bug?

Cheers,
Erik
-- 
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/