[PATCH] D10964: [Codegen] Add intrinsics 'hsum*' and corresponding SDNodes for horizontal sum operation.
Shahid via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 18 23:55:21 PDT 2015
ashahid added inline comments.
================
Comment at: test/CodeGen/X86/vec-hadd-int-256.ll:14
@@ +13,2 @@
+ ret i64 %1
+}
----------------
RKSimon wrote:
> This codegen is the same as for the test1_hsum_int_i64 <2 x i64> version in vec-hadd-int-128.ll - something is going wrong. You probably should compare against codegen from an AVX2 target.
With AVX2, the generated code differs between the two cases, as shown below.
**Case V2i64**
  vpshufd $78, %xmm0, %xmm1 # xmm1 = xmm0[2,3,0,1]
  vpaddq %xmm1, %xmm0, %xmm0
  vmovq %xmm0, %rax
  retq
**Case V4i64**
  vextracti128 $1, %ymm0, %xmm1
  vpaddq %ymm1, %ymm0, %ymm0
  vpermq $237, %ymm0, %ymm1 # ymm1 = ymm0[1,3,2,3]
  vpaddq %ymm1, %ymm0, %ymm0
  vmovq %xmm0, %rax
  vzeroupper
  retq
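
To sanity-check that both sequences leave the full horizontal sum in lane 0, here is a rough scalar model in Python. It is only a sketch and not part of the patch: the helper names (vpaddq, vpshufd, vpermq, vextracti128_hi, hsum_v2i64, hsum_v4i64) are made up for illustration, vectors are plain lists of 64-bit lanes with lane 0 first, and it assumes the usual VEX behaviour that writing %xmm1 zeroes the upper half of %ymm1.

```python
MASK64 = (1 << 64) - 1  # i64 arithmetic wraps modulo 2**64


def vpaddq(a, b):
    """Lane-wise 64-bit add (works for 2-lane xmm and 4-lane ymm lists)."""
    return [(x + y) & MASK64 for x, y in zip(a, b)]


def vpshufd(src, imm8):
    """Shuffle the four 32-bit dwords of a 2 x i64 xmm value."""
    dwords = []
    for q in src:
        dwords += [q & 0xFFFFFFFF, q >> 32]  # low dword first
    out = [dwords[(imm8 >> (2 * i)) & 3] for i in range(4)]
    return [out[0] | (out[1] << 32), out[2] | (out[3] << 32)]


def vpermq(src, imm8):
    """Permute the four 64-bit lanes of a ymm value."""
    return [src[(imm8 >> (2 * i)) & 3] for i in range(4)]


def vextracti128_hi(src):
    """Model 'vextracti128 $1': upper 128 bits go to xmm, upper ymm is zeroed."""
    return [src[2], src[3], 0, 0]


def hsum_v2i64(v):
    t = vpshufd(v, 78)   # xmm1 = xmm0[2,3,0,1], i.e. the two qwords swapped
    s = vpaddq(t, v)     # lane 0 = v0 + v1
    return s[0]          # vmovq %xmm0, %rax


def hsum_v4i64(v):
    t = vextracti128_hi(v)
    s = vpaddq(t, v)     # lanes: v0+v2, v1+v3, v2, v3
    p = vpermq(s, 237)   # ymm1 = ymm0[1,3,2,3]
    r = vpaddq(p, s)     # lane 0 = (v1+v3) + (v0+v2)
    return r[0]          # vmovq %xmm0, %rax


if __name__ == "__main__":
    print(hsum_v2i64([1, 2]))
    print(hsum_v4i64([1, 2, 3, 4]))
```

In this model only lane 0 of the V4i64 result is meaningful; the other lanes hold partial sums that the final vmovq discards.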
http://reviews.llvm.org/D10964