[LLVMdev] Vectorized LLVM IR

Fri May 28 12:40:00 PDT 2010

Are your loads and stores 16-byte aligned?

 %0 = load <4 x float>* %scevgep9.vec, align 16

If not, that could cause some slowdown, because unaligned moves will be used.
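
To make the difference concrete, here is a minimal sketch in the same IR syntax as the line above. The function names @load_aligned and @load_unaligned are made up for illustration; only the align value differs between them. With SSE enabled I would expect the first to be selected as an aligned move (movaps) and the second as an unaligned one (movups), but check the assembly llc produces for your own build.

```llvm
; Hypothetical example: the same vector load with two alignment promises.

; The pointer is promised to be 16-byte aligned, so the backend
; is free to use an aligned SSE move.
define <4 x float> @load_aligned(<4 x float>* %p) {
entry:
  %v = load <4 x float>* %p, align 16
  ret <4 x float> %v
}

; The pointer is only promised to be 4-byte aligned, so the backend
; has to fall back to an unaligned move.
define <4 x float> @load_unaligned(<4 x float>* %p) {
entry:
  %v = load <4 x float>* %p, align 4
  ret <4 x float> %v
}
```

If the generator cannot guarantee that a buffer is 16-byte aligned, the smaller align value is the safe one to emit. Claiming align 16 on a pointer that is not actually aligned is undefined behavior and would typically fault at run time.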

On Fri, May 28, 2010 at 2:13 PM, Stéphane Letz <letz at grame.fr> wrote:
> Hi,
>
> We are experimenting with directly generating vectorized LLVM IR (using types such as <8 x float>), then compiling the code to SSE on a 64-bit machine. Right now the equivalent scalar code still outperforms the SSE version.
>
> What is the quality of the SSE support in the X86 LLVM backend? Are there any specific things to be aware of to improve the speed?
>
> Thanks
>
> Stéphane Letz