[LLVMdev] unaligned AVX store gets split into two instructions

Tue Jul 9 21:57:13 PDT 2013

On Tue, Jul 09, 2013 at 09:01:48PM -0700, Zach Devito wrote:
> I'm seeing a difference in how LLVM 3.3 and 3.2 emit unaligned vector loads
> on AVX.
> 3.3 is splitting up an unaligned vector load but in 3.2, it was emitted as
> a single instruction (details below).
> In a matrix-matrix inner-kernel, I see a ~25% decrease in performance,
> which seems to be due to this.
> 
> Any ideas why this changed? Thanks!
>

Hi Zach,

I ran into a similar problem with the R600 backend, and I was able to fix it
by implementing TargetLowering::allowsUnalignedMemoryAccesses() for the target.
Take a look at r184822.

-Tom

> Zach
> 
> LLVM Code:
> define <4 x double> @vstore(<4 x double>*) {
> entry:
>   %1 = load <4 x double>* %0, align 8
>   ret <4 x double> %1
> }
> ------------------------------------------------------------
> Running llvm-32/bin/llc vstore.ll creates:
>         .section        __TEXT,__text,regular,pure_instructions
>         .globl  _vstore
>         .align  4, 0x90
> _vstore:                                ## @vstore
>         .cfi_startproc
> ## BB#0:                                ## %entry
>         pushq   %rbp
> Ltmp2:
>         .cfi_def_cfa_offset 16
> Ltmp3:
>         .cfi_offset %rbp, -16
>         movq    %rsp, %rbp
> Ltmp4:
>         .cfi_def_cfa_register %rbp
>         vmovups (%rdi), %ymm0
>         popq    %rbp
>         ret
>         .cfi_endproc
> ----------------------------------------------------------------
> Running llvm-33/bin/llc vstore.ll creates:
>         .section        __TEXT,__text,regular,pure_instructions
>         .globl  _main
>         .align  4, 0x90
> _main:                                  ## @main
>         .cfi_startproc
> ## BB#0:                                ## %entry
>         vmovups (%rdi), %xmm0
>         vinsertf128     $1, 16(%rdi), %ymm0, %ymm0
>         ret
>         .cfi_endproc
> 
> 
> .subsections_via_symbols
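
For readers comparing the two listings above, here is a rough C++ sketch of
the same two load strategies written with AVX intrinsics. It is only an
illustration, not code from either LLVM release: the function names
(load_whole, load_split, loads_agree) are made up for this example, it assumes
an x86-64 GCC or Clang toolchain, and the compiler is free to emit different
instructions than the ones named in the comments.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstring>

// Strategy 1: one 256-bit unaligned load. This is the shape of the 3.2
// output, `vmovups (%rdi), %ymm0` (actual codegen may differ).
__attribute__((target("avx")))
void load_whole(const double *src, double *dst) {
    __m256d v = _mm256_loadu_pd(src);
    _mm256_storeu_pd(dst, v);
}

// Strategy 2: two 128-bit halves. This is the shape of the 3.3 output,
// `vmovups (%rdi), %xmm0` followed by `vinsertf128 $1, 16(%rdi), ...`.
__attribute__((target("avx")))
void load_split(const double *src, double *dst) {
    __m128d lo = _mm_loadu_pd(src);      // elements 0..1
    __m128d hi = _mm_loadu_pd(src + 2);  // elements 2..3 (byte offset 16)
    __m256d v = _mm256_insertf128_pd(_mm256_castpd128_pd256(lo), hi, 1);
    _mm256_storeu_pd(dst, v);
}

// Hypothetical helper: returns true when both strategies read the same
// four doubles. On a CPU without AVX it skips the check and returns true.
bool loads_agree(const double *src) {
    if (!__builtin_cpu_supports("avx")) return true;
    double a[4], b[4];
    load_whole(src, a);
    load_split(src, b);
    return std::memcmp(a, b, sizeof a) == 0;
}
```

Both functions produce the same vector, so the difference between the two
listings is only in how many instructions are used to perform the load.
Calling loads_agree() on a pointer that is 8-byte but not 32-byte aligned
matches the `align 8` case in the IR above.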
