[PATCH] D72302: [X86] Improve lowering of v2i64 sign bit tests on pre-sse4.2 targets
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 6 13:20:34 PST 2020
craig.topper marked an inline comment as done.
craig.topper added inline comments.
================
Comment at: llvm/test/CodeGen/X86/bitcast-vector-bool.ll:432
; SSE2-SSSE3: # %bb.0:
-; SSE2-SSSE3-NEXT: movdqa {{.*#+}} xmm4 = [2147483648,2147483648]
-; SSE2-SSSE3-NEXT: pxor %xmm4, %xmm3
-; SSE2-SSSE3-NEXT: movdqa %xmm4, %xmm5
-; SSE2-SSSE3-NEXT: pcmpgtd %xmm3, %xmm5
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2]
-; SSE2-SSSE3-NEXT: pcmpeqd %xmm4, %xmm3
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3]
-; SSE2-SSSE3-NEXT: pand %xmm6, %xmm3
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3]
-; SSE2-SSSE3-NEXT: por %xmm3, %xmm5
-; SSE2-SSSE3-NEXT: pxor %xmm4, %xmm2
-; SSE2-SSSE3-NEXT: movdqa %xmm4, %xmm3
-; SSE2-SSSE3-NEXT: pcmpgtd %xmm2, %xmm3
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2]
-; SSE2-SSSE3-NEXT: pcmpeqd %xmm4, %xmm2
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3]
-; SSE2-SSSE3-NEXT: pand %xmm6, %xmm7
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3]
-; SSE2-SSSE3-NEXT: por %xmm7, %xmm2
-; SSE2-SSSE3-NEXT: packssdw %xmm5, %xmm2
-; SSE2-SSSE3-NEXT: pxor %xmm4, %xmm1
-; SSE2-SSSE3-NEXT: movdqa %xmm4, %xmm3
-; SSE2-SSSE3-NEXT: pcmpgtd %xmm1, %xmm3
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2]
-; SSE2-SSSE3-NEXT: pcmpeqd %xmm4, %xmm1
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]
-; SSE2-SSSE3-NEXT: pand %xmm5, %xmm1
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3]
-; SSE2-SSSE3-NEXT: por %xmm1, %xmm3
-; SSE2-SSSE3-NEXT: pxor %xmm4, %xmm0
-; SSE2-SSSE3-NEXT: movdqa %xmm4, %xmm1
-; SSE2-SSSE3-NEXT: pcmpgtd %xmm0, %xmm1
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2]
-; SSE2-SSSE3-NEXT: pcmpeqd %xmm4, %xmm0
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3]
-; SSE2-SSSE3-NEXT: pand %xmm5, %xmm0
-; SSE2-SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]
-; SSE2-SSSE3-NEXT: por %xmm0, %xmm1
-; SSE2-SSSE3-NEXT: packssdw %xmm3, %xmm1
-; SSE2-SSSE3-NEXT: packssdw %xmm2, %xmm1
-; SSE2-SSSE3-NEXT: packsswb %xmm0, %xmm1
-; SSE2-SSSE3-NEXT: pmovmskb %xmm1, %eax
+; SSE2-SSSE3-NEXT: packssdw %xmm3, %xmm2
+; SSE2-SSSE3-NEXT: packssdw %xmm1, %xmm0
----------------
The new code is simple enough that SimplifyDemandedBits was able to see through it. The pmovmskb only needs the sign bits of its input, and packss doesn't alter sign bits, so it was able to prove the compare unnecessary.
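As a rough illustration of that argument, here is a scalar Python model, not the actual lowering or the SimplifyDemandedBits code. The helper names (`sat`, `packss`, `movmsk`, `sign_compare`) are hypothetical, and the lane widths are chosen only for the example. It assumes signed-saturating pack semantics and a mask that collects one sign bit per lane, then checks that packing the raw lanes gives the same mask as packing the result of a "less than zero" compare.

```python
def sat(value, bits):
    """Clamp a signed integer to the range of a signed `bits`-wide lane."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def packss(a, b, out_bits):
    """Model of a signed-saturating pack: narrow every lane of a, then b."""
    return [sat(v, out_bits) for v in a + b]


def movmsk(lanes):
    """Model of a movmsk-style op: collect one sign bit per lane."""
    mask = 0
    for i, v in enumerate(lanes):
        if v < 0:
            mask |= 1 << i
    return mask


def sign_compare(lanes):
    """Model of 'x < 0' per lane: all-ones (-1) if negative, else 0."""
    return [-1 if v < 0 else 0 for v in lanes]


def mask_with_compare(a, b):
    # Old shape: compare first, then pack, then extract sign bits.
    return movmsk(packss(sign_compare(a), sign_compare(b), 16))


def mask_without_compare(a, b):
    # New shape: pack the raw lanes; saturation keeps each sign bit.
    return movmsk(packss(a, b, 16))


a = [-5, 70000, -2147483648, 0]
b = [1, -1, 2147483647, -40000]
print(mask_with_compare(a, b) == mask_without_compare(a, b))
```

Saturation clamps a negative lane to a negative value and a non-negative lane to a non-negative value, so the only bit the mask reads is unchanged by the pack, which is why the compare contributes nothing in this model.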
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D72302/new/
https://reviews.llvm.org/D72302