[LLVMdev] SIMD for sdiv <2 x i64>
zhi chen
zchenhn at gmail.com
Thu Jul 23 23:06:49 PDT 2015
It seems that that it's hard to vectorize int64 in LLVM. For example, LLVM
3.4 generates very complicated code for the following IR. I am running on a
Haswell processor. Is it because there is no alternative AVX/2 instructions
for int64? The same thing also happens to zext <2 x i32> -> <2 x i64> and
trunc <2 x i64> -> <2 x i32>. Any ideas to optimize these instructions?
Thanks.
%sub.ptr.sub.i6.i.i.i.i = sub <2 x i64> %sub.ptr.lhs.cast.i4.i.i.i.i,
%sub.ptr.rhs.cast.i5.i.i.i.i
%sub.ptr.div.i7.i.i.i.i = sdiv <2 x i64> %sub.ptr.sub.i6.i.i.i.i, <i64 24,
i64 24>
Assembly:
vpsubq %xmm6, %xmm5, %xmm5
vmovq %xmm5, %rax
movabsq $3074457345618258603, %rbx # imm = 0x2AAAAAAAAAAAAAAB
imulq %rbx
movq %rdx, %rcx
movq %rcx, %rax
shrq $63, %rax
shrq $2, %rcx
addl %eax, %ecx
vpextrq $1, %xmm5, %rax
imulq %rbx
movq %rdx, %rax
shrq $63, %rax
shrq $2, %rdx
addl %eax, %edx
movslq %edx, %rax
vmovq %rax, %xmm5
movslq %ecx, %rax
vmovq %rax, %xmm6
vpunpcklqdq %xmm5, %xmm6, %xmm5 # xmm5 = xmm6[0],xmm5[0]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150723/4c853c43/attachment.html>
More information about the llvm-dev
mailing list