[PATCH] D72302: [X86] Improve lowering of v2i64 sign bit tests on pre-sse4.2 targets

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 6 13:20:34 PST 2020


craig.topper marked an inline comment as done.
craig.topper added inline comments.


================
Comment at: llvm/test/CodeGen/X86/bitcast-vector-bool.ll:432
 ; SSE2-SSSE3:       # %bb.0:
-; SSE2-SSSE3-NEXT:    movdqa {{.*#+}} xmm4 = [2147483648,2147483648]
-; SSE2-SSSE3-NEXT:    pxor %xmm4, %xmm3
-; SSE2-SSSE3-NEXT:    movdqa %xmm4, %xmm5
-; SSE2-SSSE3-NEXT:    pcmpgtd %xmm3, %xmm5
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2]
-; SSE2-SSSE3-NEXT:    pcmpeqd %xmm4, %xmm3
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3]
-; SSE2-SSSE3-NEXT:    pand %xmm6, %xmm3
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3]
-; SSE2-SSSE3-NEXT:    por %xmm3, %xmm5
-; SSE2-SSSE3-NEXT:    pxor %xmm4, %xmm2
-; SSE2-SSSE3-NEXT:    movdqa %xmm4, %xmm3
-; SSE2-SSSE3-NEXT:    pcmpgtd %xmm2, %xmm3
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2]
-; SSE2-SSSE3-NEXT:    pcmpeqd %xmm4, %xmm2
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3]
-; SSE2-SSSE3-NEXT:    pand %xmm6, %xmm7
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3]
-; SSE2-SSSE3-NEXT:    por %xmm7, %xmm2
-; SSE2-SSSE3-NEXT:    packssdw %xmm5, %xmm2
-; SSE2-SSSE3-NEXT:    pxor %xmm4, %xmm1
-; SSE2-SSSE3-NEXT:    movdqa %xmm4, %xmm3
-; SSE2-SSSE3-NEXT:    pcmpgtd %xmm1, %xmm3
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2]
-; SSE2-SSSE3-NEXT:    pcmpeqd %xmm4, %xmm1
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]
-; SSE2-SSSE3-NEXT:    pand %xmm5, %xmm1
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3]
-; SSE2-SSSE3-NEXT:    por %xmm1, %xmm3
-; SSE2-SSSE3-NEXT:    pxor %xmm4, %xmm0
-; SSE2-SSSE3-NEXT:    movdqa %xmm4, %xmm1
-; SSE2-SSSE3-NEXT:    pcmpgtd %xmm0, %xmm1
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2]
-; SSE2-SSSE3-NEXT:    pcmpeqd %xmm4, %xmm0
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3]
-; SSE2-SSSE3-NEXT:    pand %xmm5, %xmm0
-; SSE2-SSSE3-NEXT:    pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]
-; SSE2-SSSE3-NEXT:    por %xmm0, %xmm1
-; SSE2-SSSE3-NEXT:    packssdw %xmm3, %xmm1
-; SSE2-SSSE3-NEXT:    packssdw %xmm2, %xmm1
-; SSE2-SSSE3-NEXT:    packsswb %xmm0, %xmm1
-; SSE2-SSSE3-NEXT:    pmovmskb %xmm1, %eax
+; SSE2-SSSE3-NEXT:    packssdw %xmm3, %xmm2
+; SSE2-SSSE3-NEXT:    packssdw %xmm1, %xmm0
----------------
The new code is simple enough that SimplifyDemandedBits was able to see through it. The pmovmskb only needs the sign bits of its input, and packss doesn't alter sign bits, so it was able to prove the compare unnecessary.
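
The sign-bit argument can be checked with a small scalar model. This is only an illustrative sketch, not LLVM code: `sat_pack`, `sign_bit`, and `movmsk` are hypothetical helper names, and each function models a single lane (or a plain list of lanes) rather than real XMM register layout.

```python
def sat_pack(value, out_bits):
    """Signed saturation to out_bits, modelling one lane of packssdw/packsswb."""
    lo = -(1 << (out_bits - 1))
    hi = (1 << (out_bits - 1)) - 1
    return max(lo, min(hi, value))


def sign_bit(value):
    """Return 1 for a negative lane, 0 otherwise."""
    return 1 if value < 0 else 0


def movmsk(lanes):
    """Collect one sign bit per lane into an integer mask, as pmovmskb does per byte."""
    mask = 0
    for i, lane in enumerate(lanes):
        mask |= sign_bit(lane) << i
    return mask


# Assumed sample i32 lane values, including both saturation extremes.
samples = [-(1 << 31), -70000, -1, 0, 1, 70000, (1 << 31) - 1]

packed16 = [sat_pack(v, 16) for v in samples]   # packssdw step: i32 -> i16
packed8 = [sat_pack(v, 8) for v in packed16]    # packsswb step: i16 -> i8

# Saturation clamps magnitudes but never flips a sign, so the mask is unchanged.
print(movmsk(samples), movmsk(packed16), movmsk(packed8))
```

Because signed saturation clamps toward the nearest representable value of the same sign, every lane keeps its sign bit through both pack steps, so a consumer that reads only sign bits sees the same mask with or without an intervening "less than zero" compare.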


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72302/new/

https://reviews.llvm.org/D72302




