[PATCH] D10964: [Codegen] Add intrinsics 'hsum*' and corresponding SDNodes for horizontal sum operation.

Shahid via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 18 23:55:21 PDT 2015


ashahid added inline comments.

================
Comment at: test/CodeGen/X86/vec-hadd-int-256.ll:14
@@ +13,2 @@
+  ret i64 %1
+}
----------------
RKSimon wrote:
> This codegen is the same as for the test1_hsum_int_i64 <2x i64> version in vec-hadd-int-128.ll - something is going wrong. You probably should compare against codegen from a AVX2 target.
With AVX2, the generated code differs between the two cases, as shown below.

**Case V2i64**
        vpshufd $78, %xmm0, %xmm1       # xmm1 = xmm0[2,3,0,1]
        vpaddq  %xmm1, %xmm0, %xmm0
        vmovq   %xmm0, %rax
        retq


**Case V4i64**
        vextracti128    $1, %ymm0, %xmm1
        vpaddq  %ymm1, %ymm0, %ymm0
        vpermq  $237, %ymm0, %ymm1      # ymm1 = ymm0[1,3,2,3]
        vpaddq  %ymm1, %ymm0, %ymm0
        vmovq   %xmm0, %rax
        vzeroupper
        retq
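
To sanity-check the two sequences above, here is a rough Python model of what each one computes. It is only a sketch, not part of the patch: the helper names (`decode_imm8`, `hsum_v2i64`, `hsum_v4i64`) are made up for illustration, vectors are modelled as lists of 64-bit lanes with lane 0 first, and it assumes the usual VEX behaviour that writing %xmm1 zeroes the upper half of %ymm1.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit wraparound of vpaddq


def decode_imm8(imm):
    """Split an 8-bit shuffle immediate into four 2-bit source indices, low bits first."""
    return [(imm >> (2 * i)) & 3 for i in range(4)]


def add_lanes(a, b):
    """Lane-wise 64-bit add, as vpaddq does."""
    return [(x + y) & MASK64 for x, y in zip(a, b)]


def hsum_v2i64(v):
    """Model of the V2i64 sequence; v holds two 64-bit lanes."""
    # vpshufd $78 picks dwords [2,3,0,1], which swaps the two 64-bit lanes.
    swapped = [v[1], v[0]]
    summed = add_lanes(swapped, v)
    return summed[0]  # vmovq reads lane 0


def hsum_v4i64(v):
    """Model of the V4i64 sequence; v holds four 64-bit lanes."""
    # vextracti128 $1: upper 128 bits go to xmm1; the upper half of ymm1
    # is assumed to be zeroed by the VEX-encoded write.
    hi = [v[2], v[3], 0, 0]
    s1 = add_lanes(hi, v)
    # vpermq $237 selects qwords [1,3,2,3] from the partial sums.
    perm = [s1[i] for i in decode_imm8(237)]
    s2 = add_lanes(perm, s1)
    return s2[0]  # vmovq reads lane 0
```

Under this model, `decode_imm8(78)` and `decode_imm8(237)` reproduce the shuffle comments in the listings, and lane 0 of the V4i64 result holds the sum of all four input lanes.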


http://reviews.llvm.org/D10964




