[PATCH] D72302: [X86] Improve lowering of v2i64 sign bit tests on pre-sse4.2 targets

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 6 13:20:34 PST 2020


craig.topper created this revision.
craig.topper added reviewers: spatel, RKSimon.
Herald added a subscriber: hiraditya.
Herald added a project: LLVM.
craig.topper marked an inline comment as done.
craig.topper added inline comments.


================
Comment at: llvm/test/CodeGen/X86/bitcast-vector-bool.ll:432
 ; SSE2-SSSE3:       # %bb.0:
-; SSE2-SSSE3-NEXT:    movdqa {{.*#+}} xmm4 = [2147483648,2147483648]
-; SSE2-SSSE3-NEXT:    pxor %xmm4, %xmm3
-; SSE2-SSSE3-NEXT:    movdqa %xmm4, %xmm5
-; SSE2-SSSE3-NEXT:    pcmpgtd %xmm3, %xmm5
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2]
-; SSE2-SSSE3-NEXT:    pcmpeqd %xmm4, %xmm3
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3]
-; SSE2-SSSE3-NEXT:    pand %xmm6, %xmm3
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3]
-; SSE2-SSSE3-NEXT:    por %xmm3, %xmm5
-; SSE2-SSSE3-NEXT:    pxor %xmm4, %xmm2
-; SSE2-SSSE3-NEXT:    movdqa %xmm4, %xmm3
-; SSE2-SSSE3-NEXT:    pcmpgtd %xmm2, %xmm3
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2]
-; SSE2-SSSE3-NEXT:    pcmpeqd %xmm4, %xmm2
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3]
-; SSE2-SSSE3-NEXT:    pand %xmm6, %xmm7
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3]
-; SSE2-SSSE3-NEXT:    por %xmm7, %xmm2
-; SSE2-SSSE3-NEXT:    packssdw %xmm5, %xmm2
-; SSE2-SSSE3-NEXT:    pxor %xmm4, %xmm1
-; SSE2-SSSE3-NEXT:    movdqa %xmm4, %xmm3
-; SSE2-SSSE3-NEXT:    pcmpgtd %xmm1, %xmm3
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2]
-; SSE2-SSSE3-NEXT:    pcmpeqd %xmm4, %xmm1
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]
-; SSE2-SSSE3-NEXT:    pand %xmm5, %xmm1
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3]
-; SSE2-SSSE3-NEXT:    por %xmm1, %xmm3
-; SSE2-SSSE3-NEXT:    pxor %xmm4, %xmm0
-; SSE2-SSSE3-NEXT:    movdqa %xmm4, %xmm1
-; SSE2-SSSE3-NEXT:    pcmpgtd %xmm0, %xmm1
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2]
-; SSE2-SSSE3-NEXT:    pcmpeqd %xmm4, %xmm0
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3]
-; SSE2-SSSE3-NEXT:    pand %xmm5, %xmm0
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]
-; SSE2-SSSE3-NEXT:    por %xmm0, %xmm1
-; SSE2-SSSE3-NEXT:    packssdw %xmm3, %xmm1
-; SSE2-SSSE3-NEXT:    packssdw %xmm2, %xmm1
-; SSE2-SSSE3-NEXT:    packsswb %xmm0, %xmm1
-; SSE2-SSSE3-NEXT:    pmovmskb %xmm1, %eax
+; SSE2-SSSE3-NEXT:    packssdw %xmm3, %xmm2
+; SSE2-SSSE3-NEXT:    packssdw %xmm1, %xmm0
----------------
The new code is simple enough that simplify demanded bits was able to get through it. The movmskb only needs the sign bits from its input and packss doesn't alter sign bits so it was able to prove the compare unnecessary.


Without sse4.2 a v2i64 setlt needs to expand into a pcmpgtd, pcmpeqd, 3 shuffles, and 2 logic ops. But if we're only interested in the sign bit of the i64 elements, we can just use one pcmpgtd and shuffle the odd elements to the even elements.


https://reviews.llvm.org/D72302

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/test/CodeGen/X86/bitcast-vector-bool.ll
  llvm/test/CodeGen/X86/movmsk-cmp.ll
  llvm/test/CodeGen/X86/sadd_sat_vec.ll
  llvm/test/CodeGen/X86/ssub_sat_vec.ll
  llvm/test/CodeGen/X86/vec_saddo.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D72302.236444.patch
Type: text/x-patch
Size: 115264 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200106/2cff9ce4/attachment.bin>


More information about the llvm-commits mailing list