[llvm] [ValueTracking] Let ComputeKnownSignBits handle (shl (zext X), C) (PR #97693)
Björn Pettersson via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 4 11:16:31 PDT 2024
================
@@ -3757,11 +3757,21 @@ static unsigned ComputeNumSignBitsImpl(const Value *V,
}
case Instruction::Shl: {
const APInt *ShAmt;
+ Value *X = nullptr;
if (match(U->getOperand(1), m_APInt(ShAmt))) {
// shl destroys sign bits.
- Tmp = ComputeNumSignBits(U->getOperand(0), Depth + 1, Q);
- if (ShAmt->uge(TyBits) || // Bad shift.
- ShAmt->uge(Tmp)) break; // Shifted all sign bits out.
+ if (ShAmt->uge(TyBits))
+ break; // Bad shift.
+ // We can look through a zext (more or less treating it as a sext) if
+ // all extended bits are shifted out.
+ if (match(U->getOperand(0), m_ZExt(m_Value(X))) &&
+ ShAmt->uge(TyBits - X->getType()->getScalarSizeInBits())) {
+ Tmp = ComputeNumSignBits(X, Depth + 1, Q);
----------------
bjope wrote:
I think it is related to this piece of code in InstCombinerImpl::SimplifyDemandedUseBits:
```
case Instruction::AShr: {
unsigned SignBits = ComputeNumSignBits(I->getOperand(0), Depth + 1, Q.CxtI);
...
// If the input sign bit is known to be zero, or if none of the top bits
// are demanded, turn this into an unsigned shift right.
assert(BitWidth > ShiftAmt && "Shift amount not saturated?");
APInt HighBits(APInt::getHighBitsSet(
BitWidth, std::min(SignBits + ShiftAmt - 1, BitWidth)));
if (Known.Zero[BitWidth-ShiftAmt-1] ||
!DemandedMask.intersects(HighBits)) {
BinaryOperator *LShr = BinaryOperator::CreateLShr(I->getOperand(0),
I->getOperand(1));
LShr->setIsExact(cast<BinaryOperator>(I)->isExact());
LShr->takeName(I);
return InsertNewInstWith(LShr, I->getIterator());
}
```
And I do not fully understand the logic for the HighBits computation.
Given `%conv6.i = ashr exact i64 %shl.i.i, 32`, I think that BitWidth is 64 and ShiftAmt is 32.
So the code is doing
`APInt::getHighBitsSet(64, std::min(SignBits + 32 - 1, 64))`
And if we know nothing about sign bits we will set the 32 high bits. But the more we know about sign bits, the more bits will be set. With this patch SignBits==4, so we get more bits set in HighBits.
But then the check `!DemandedMask.intersects(HighBits)` is less likely to be fulfilled, because the intersection is more likely to happen when we are able to compute more sign bits. So the better the ComputeNumSignBits analysis, the less likely we are to apply the optimization converting the AShr to an LShr. That doesn't make sense to me!
https://github.com/llvm/llvm-project/pull/97693