[PATCH] D10964: [Codegen] Add intrinsics 'hsum*' and corresponding SDNodes for horizontal sum operation.

Cong Hou via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 28 17:18:33 PDT 2015


congh added inline comments.

================
Comment at: test/CodeGen/X86/vec-hadd-int-128.ll:8
@@ +7,3 @@
+; CHECK:       # BB#0:
+; CHECK-NEXT:    pshufd {{.*#+}} xmm1 = xmm0[2,3,0,1]
+; CHECK-NEXT:    paddd %xmm0, %xmm1
----------------
davidxl wrote:
> The result does not look right -- should pshufb be generated instead?
I think a shift operation is required here, given that we only have SSE2 support for x86_64.
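
To illustrate the point about shifts: pshufb is an SSSE3 instruction, so an SSE2-only lowering has to move bytes some other way. The following is a minimal sketch, assuming the test covers byte-sized lanes (the thread does not state the element type). It models psrldq and paddb in scalar C++; the helper names are hypothetical and this is not the patch's actual lowering.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Scalar model of SSE2 `psrldq`: shift the whole 128-bit value right by
// `n` bytes, filling the vacated high bytes with zero.
Bytes16 model_psrldq(const Bytes16& v, std::size_t n) {
    Bytes16 out{};
    for (std::size_t i = 0; i + n < v.size(); ++i) {
        out[i] = v[i + n];
    }
    return out;
}

// Scalar model of SSE2 `paddb`: lane-wise 8-bit add with wraparound.
Bytes16 model_paddb(const Bytes16& a, const Bytes16& b) {
    Bytes16 out{};
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(a[i] + b[i]);
    }
    return out;
}

// Hypothetical helper: log-step horizontal sum of 16 byte lanes using
// only shift and add. Shifts by 8, 4, 2, 1 bytes fold the upper half
// onto the lower half each time; lane 0 ends up holding the sum mod 256.
std::uint8_t hsum_bytes_sse2_model(Bytes16 v) {
    for (std::size_t shift = 8; shift >= 1; shift /= 2) {
        v = model_paddb(v, model_psrldq(v, shift));
    }
    return v[0];
}
```

Only lane 0 of the final value is meaningful; the other lanes hold partial sums.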

================
Comment at: test/CodeGen/X86/vec-hadd-int-128.ll:24
@@ +23,3 @@
+; CHECK:       # BB#0:
+; CHECK-NEXT:    pshufd {{.*#+}} xmm1 = xmm0[2,3,0,1]
+; CHECK-NEXT:    paddd %xmm0, %xmm1
----------------
davidxl wrote:
> Should pshufw be generated?
> 
> Or would phaddw be more efficient?
In SSE2, pshuflw should be generated here. phaddw was introduced in SSSE3.
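
As a rough sketch of where pshuflw would fit, the scalar C++ model below reduces eight 16-bit lanes in three shuffle-and-add steps, with pshuflw handling the last step, where adjacent 16-bit lanes must be swapped and a dword-granular pshufd is too coarse. The exact instruction sequence and the helper names are assumptions for illustration, not the output of the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Words8 = std::array<std::uint16_t, 8>;
using Sel4 = std::array<std::size_t, 4>;

// Scalar model of `pshufd`: result dword i = source dword sel[i].
// Each dword is a pair of adjacent 16-bit lanes.
Words8 model_pshufd(const Words8& v, const Sel4& sel) {
    Words8 out{};
    for (std::size_t i = 0; i < 4; ++i) {
        out[2 * i] = v[2 * sel[i]];
        out[2 * i + 1] = v[2 * sel[i] + 1];
    }
    return out;
}

// Scalar model of `pshuflw`: shuffle the low four 16-bit lanes by `sel`
// and copy the high four lanes unchanged.
Words8 model_pshuflw(const Words8& v, const Sel4& sel) {
    Words8 out = v;
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = v[sel[i]];
    }
    return out;
}

// Scalar model of `paddw`: lane-wise 16-bit add with wraparound.
Words8 model_paddw(const Words8& a, const Words8& b) {
    Words8 out{};
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = static_cast<std::uint16_t>(a[i] + b[i]);
    }
    return out;
}

// Hypothetical helper: horizontal sum of eight 16-bit lanes (mod 65536).
std::uint16_t hsum_words_sse2_model(Words8 v) {
    v = model_paddw(v, model_pshufd(v, {2, 3, 0, 1}));   // swap 64-bit halves
    v = model_paddw(v, model_pshufd(v, {1, 0, 3, 2}));   // swap adjacent dwords
    v = model_paddw(v, model_pshuflw(v, {1, 0, 3, 2}));  // swap adjacent words
    return v[0];
}
```

With SSSE3 available, phaddw could replace the shuffle-and-add pairs, which is the alternative raised in the quoted comment.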


http://reviews.llvm.org/D10964




