[LLVMdev] SIMD for sdiv <2 x i64>

zhi chen zchenhn at gmail.com
Thu Jul 23 23:06:49 PDT 2015


It seems that it is hard to vectorize int64 operations in LLVM. For example,
LLVM 3.4 generates very complicated code for the following IR. I am running
on a Haswell processor. Is it because there are no suitable AVX/AVX2
instructions for int64? The same thing also happens with
zext <2 x i32> -> <2 x i64> and trunc <2 x i64> -> <2 x i32>. Any ideas on
how to optimize these instructions? Thanks.

%sub.ptr.sub.i6.i.i.i.i = sub <2 x i64> %sub.ptr.lhs.cast.i4.i.i.i.i, %sub.ptr.rhs.cast.i5.i.i.i.i
%sub.ptr.div.i7.i.i.i.i = sdiv <2 x i64> %sub.ptr.sub.i6.i.i.i.i, <i64 24, i64 24>

Assembly:
    vpsubq  %xmm6, %xmm5, %xmm5
    vmovq   %xmm5, %rax
    movabsq $3074457345618258603, %rbx # imm = 0x2AAAAAAAAAAAAAAB
    imulq   %rbx
    movq    %rdx, %rcx
    movq    %rcx, %rax
    shrq    $63, %rax
    shrq    $2, %rcx
    addl    %eax, %ecx
    vpextrq $1, %xmm5, %rax
    imulq   %rbx
    movq    %rdx, %rax
    shrq    $63, %rax
    shrq    $2, %rdx
    addl    %eax, %edx
    movslq  %edx, %rax
    vmovq   %rax, %xmm5
    movslq  %ecx, %rax
    vmovq   %rax, %xmm6
    vpunpcklqdq %xmm5, %xmm6, %xmm5 # xmm5 = xmm6[0],xmm5[0]
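
For reference, each lane in the listing above appears to go through the
usual multiply-by-reciprocal form of a signed division by 24: take the high
64 bits of x * 0x2AAAAAAAAAAAAAAB, shift right by 2, then add the sign bit.
Below is a minimal C sketch of that per-lane computation. The function names
are my own, and it assumes a compiler with __int128 and arithmetic right
shift of negative values (GCC or Clang). It uses an arithmetic shift so the
full 64-bit quotient is correct; the listing uses shrq and a 32-bit addl,
presumably because only the low 32 bits are needed afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Signed division by 24 without a divide instruction.
   Assumption: __int128 is available and >> on a negative value is an
   arithmetic shift (true for GCC and Clang). */
int64_t sdiv24_magic(int64_t x)
{
    /* Constant loaded by movabsq in the listing. */
    const int64_t magic = 0x2AAAAAAAAAAAAAABLL;

    /* High 64 bits of the 128-bit product, i.e. what imulq leaves in %rdx. */
    int64_t hi = (int64_t)(((__int128)x * magic) >> 64);

    /* 1 if hi is negative, 0 otherwise (the shrq $63 step). */
    int64_t sign = (int64_t)((uint64_t)hi >> 63);

    /* Shift by 2, then add the sign bit to round toward zero. */
    return (hi >> 2) + sign;
}

/* Compares the sketch against the built-in operator on a few values. */
void sdiv24_selfcheck(void)
{
    const int64_t samples[] = { 0, 1, 23, 24, 47, 48, -1, -24, -25,
                                INT64_MAX, INT64_MIN };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(sdiv24_magic(samples[i]) == samples[i] / 24);
}
```

The key step is the high half of a 64x64-bit signed multiply, which is done
here with the scalar imulq once per lane.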