[llvm] a4373f6 - [X86] Don't combine (x86cmp (trunc (movmsk (bitcast X))), 0) if the truncate discards unknown bits.

Fri Nov 19 21:51:20 PST 2021

Author: Craig Topper
Date: 2021-11-19T21:50:35-08:00
New Revision: a4373f6753fa9aa89d39fbd4ec9e273f76459a58

URL: https://github.com/llvm/llvm-project/commit/a4373f6753fa9aa89d39fbd4ec9e273f76459a58
DIFF: https://github.com/llvm/llvm-project/commit/a4373f6753fa9aa89d39fbd4ec9e273f76459a58.diff

LOG: [X86] Don't combine (x86cmp (trunc (movmsk (bitcast X))), 0) if the truncate discards unknown bits.

We have a transform that tries to turn a pmovmskb into movmskps/pd, or
a movmskps into movmskpd. This transform isn't valid if the truncate
discarded bits that might be set by the original movmsk.
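
To illustrate why the truncate matters, here is a minimal scalar model of the
two instructions for the 4 x float case from the test. It is only a sketch:
the helper names (modelPMovMskB, modelMovMskPS, bitcastToBytes and the two
compare functions) are made up for this example and are not LLVM APIs, and the
lanes are assumed to hold all-ones/all-zeros compare results.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar model of pmovmskb: bit I of the result is the sign bit of byte I.
uint32_t modelPMovMskB(const std::array<uint8_t, 16> &Bytes) {
  uint32_t Mask = 0;
  for (unsigned I = 0; I != 16; ++I)
    Mask |= uint32_t(Bytes[I] >> 7) << I;
  return Mask;
}

// Scalar model of movmskps: bit I of the result is the sign bit of lane I.
uint32_t modelMovMskPS(const std::array<uint32_t, 4> &Lanes) {
  uint32_t Mask = 0;
  for (unsigned I = 0; I != 4; ++I)
    Mask |= (Lanes[I] >> 31) << I;
  return Mask;
}

// Models the bitcast from 4 x i32 to 16 x i8 (little-endian byte order).
std::array<uint8_t, 16> bitcastToBytes(const std::array<uint32_t, 4> &Lanes) {
  std::array<uint8_t, 16> Bytes{};
  for (unsigned I = 0; I != 16; ++I)
    Bytes[I] = uint8_t(Lanes[I / 4] >> (8 * (I % 4)));
  return Bytes;
}

// Original pattern: (cmp (trunc-to-i8 (pmovmskb (bitcast X))), 0).
bool originalCompareIsZero(const std::array<uint32_t, 4> &Lanes) {
  uint8_t Trunc = uint8_t(modelPMovMskB(bitcastToBytes(Lanes)));
  return Trunc == 0;
}

// Result of the invalid combine: (cmp (movmskps X), 0), truncate dropped.
bool combinedCompareIsZero(const std::array<uint32_t, 4> &Lanes) {
  return modelMovMskPS(Lanes) == 0;
}
```

With only lane 2 set, the pmovmskb result has bits 8-11 set and the truncate
to 8 bits discards all of them, so the original compare sees zero. The
movmskps result has bit 2 set, so the combined compare sees a nonzero value.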

We could fix this by inserting an AND after the new movmsk to discard
the equivalent of the truncated bits, but I've left that for a later
patch.
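
As a rough sketch of what such an AND mask could look like (this is not what
the patch implements, and keptLaneMask is a hypothetical helper, not an LLVM
function), the mask would keep one bit per wide lane that still has at least
one bit left after the truncate. It assumes a compare against zero and that
every sub-element of a wide lane has the same sign bit.

```cpp
#include <cstdint>

// Hypothetical helper: compute the AND mask to apply to the wider MOVMSK.
//   NumElts   - element count of the original (narrow-element) MOVMSK
//   BCNumElts - element count of the wider vector behind the bitcast
//   CmpBits   - number of low mask bits the truncate keeps
// Assumes NumElts is a multiple of BCNumElts and BCNumElts <= 32.
uint32_t keptLaneMask(unsigned NumElts, unsigned BCNumElts, unsigned CmpBits) {
  // Number of narrow mask bits that correspond to one wide lane.
  unsigned Scale = NumElts / BCNumElts;
  // A wide lane is still observed if any of its narrow bits survive, so
  // round up; clamp in case the truncate discards nothing.
  unsigned KeptLanes = (CmpBits + Scale - 1) / Scale;
  if (KeptLanes > BCNumElts)
    KeptLanes = BCNumElts;
  // Compute in 64 bits so that KeptLanes == 32 does not overflow the shift.
  return uint32_t((uint64_t(1) << KeptLanes) - 1);
}
```

For the test case in PR52567 (16 byte elements, 4 float lanes, 8 bits kept)
this gives a mask of 0x3, which would clear bits 2 and 3 of the movmskps
result before the compare.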

Fixes PR52567.

Differential Revision: https://reviews.llvm.org/D114306

Added: 
    

Modified: 
    llvm/lib/Target/X86/X86ISelLowering.cpp
    llvm/test/CodeGen/X86/pr52567.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index dba0321d94312..17d14053d804e 100644

--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -44004,7 +44004,11 @@ static SDValue combineSetCCMOVMSK(SDValue EFLAGS, X86::CondCode &CC,
   // signbits extend down to all the sub-elements as well.
   // Calling MOVMSK with the wider type, avoiding the bitcast, helps expose
   // potential SimplifyDemandedBits/Elts cases.
-  if (Vec.getOpcode() == ISD::BITCAST) {
+  // If we looked through a truncate that discard bits, we can't do this
+  // transform.
+  // FIXME: We could do this transform for truncates that discarded bits by
+  // inserting an AND mask between the new MOVMSK and the CMP.
+  if (Vec.getOpcode() == ISD::BITCAST && NumElts <= CmpBits) {
     SDValue BC = peekThroughBitcasts(Vec);
     MVT BCVT = BC.getSimpleValueType();
     unsigned BCNumElts = BCVT.getVectorNumElements();

diff  --git a/llvm/test/CodeGen/X86/pr52567.ll b/llvm/test/CodeGen/X86/pr52567.ll
index d18efbe93abde..d2815286f8674 100644
--- a/llvm/test/CodeGen/X86/pr52567.ll
+++ b/llvm/test/CodeGen/X86/pr52567.ll
@@ -2,17 +2,16 @@
 ; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu | FileCheck %s
 
 ; The and in the test below discards half the bits from vector icmp result.
-; FIXME: The generated code is using a movmskps, but fails to discard bits
-; 2 and 3 before the testl.
+; We use a testb after a pmovmskb to examine only 8 bits.
 
 define i32 @foo(<4 x float> %arg) {
 ; CHECK-LABEL: foo:
 ; CHECK:       # %bb.0: # %bb
 ; CHECK-NEXT:    movaps {{.*#+}} xmm1 = [1.00000005E-3,1.00000005E-3,1.00000005E-3,1.00000005E-3]
 ; CHECK-NEXT:    cmpltps %xmm0, %xmm1
-; CHECK-NEXT:    movmskps %xmm1, %ecx
+; CHECK-NEXT:    pmovmskb %xmm1, %ecx
 ; CHECK-NEXT:    xorl %eax, %eax
-; CHECK-NEXT:    testl %ecx, %ecx
+; CHECK-NEXT:    testb %cl, %cl
 ; CHECK-NEXT:    sete %al
 ; CHECK-NEXT:    retq
 bb: