[llvm] [InstCombine] Fold out-of-range bits for squaring signed integers (PR #153484)

via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 20 11:10:19 PDT 2025


================
@@ -409,10 +409,24 @@ static void computeKnownBitsMul(const Value *Op0, const Value *Op1, bool NSW,
   }
 
   bool SelfMultiply = Op0 == Op1;
-  if (SelfMultiply)
+  if (SelfMultiply) {
     SelfMultiply &=
         isGuaranteedNotToBeUndef(Op0, Q.AC, Q.CxtI, Q.DT, Depth + 1);
-  Known = KnownBits::mul(Known, Known2, SelfMultiply);
+
+    Known = KnownBits::mul(Known, Known2, SelfMultiply);
+
+    unsigned SignBits = ComputeNumSignBits(Op0, DemandedElts, Q, Depth + 1);
+    unsigned TyBits = Op0->getType()->getScalarSizeInBits();
+    unsigned OutValidBits = 2 * (TyBits - SignBits + 1);
+
+    if (OutValidBits < TyBits) {
+      APInt KnownZeroMask =
+          APInt::getHighBitsSet(TyBits, TyBits - OutValidBits + 1);
+      Known.Zero |= KnownZeroMask;
+    }
+  } else {
+    Known = KnownBits::mul(Known, Known2, SelfMultiply);
----------------
Aethezz wrote:

After testing, I found that adding a separate `if (SelfMultiply) {}` block causes the `update_test_checks.py` run to produce unfolded IR. I think this comes from the preceding `if (SelfMultiply)`, where `SelfMultiply &= isGuaranteedNotToBeUndef(Op0, Q.AC, Q.CxtI, Q.DT, Depth + 1);` returns false. Since `SelfMultiply` is false from that point on, our new logic never runs.
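To make the control-flow issue concrete, here is a minimal standalone sketch for 8-bit values. It is not the LLVM code: `numSignBits8`, `knownZeroMask8`, and `foldMaskSecondCheck` are hypothetical helpers, and the `NotUndef` flag only stands in for the result of `isGuaranteedNotToBeUndef`. The mask arithmetic mirrors the hunk above with `TyBits = 8`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ComputeNumSignBits on a concrete 8-bit value:
// counts how many leading bits equal the sign bit (always at least 1).
static unsigned numSignBits8(int8_t V) {
  uint8_t U = static_cast<uint8_t>(V);
  unsigned Sign = U >> 7;
  unsigned N = 1;
  for (int Bit = 6; Bit >= 0 && ((U >> Bit) & 1u) == Sign; --Bit)
    ++N;
  return N;
}

// Same arithmetic as the hunk, specialised to TyBits = 8.
// Returns the bits of x * x that are claimed to be zero.
static uint8_t knownZeroMask8(unsigned SignBits) {
  const unsigned TyBits = 8;
  unsigned OutValidBits = 2 * (TyBits - SignBits + 1);
  if (OutValidBits >= TyBits)
    return 0; // nothing can be inferred
  unsigned HighBits = TyBits - OutValidBits + 1;
  return static_cast<uint8_t>(0xFFu << (TyBits - HighBits));
}

// Models the variant with a second `if (SelfMultiply)`: the flag has
// already been narrowed by the not-undef check, so the fold is skipped
// whenever that check fails.
static uint8_t foldMaskSecondCheck(bool SameOperand, bool NotUndef,
                                   unsigned SignBits) {
  bool SelfMultiply = SameOperand;
  if (SelfMultiply)
    SelfMultiply &= NotUndef;
  if (SelfMultiply)
    return knownZeroMask8(SignBits);
  return 0;
}

// Brute-force check that the mask never covers a set bit of the
// truncated square, for every 8-bit input.
static bool maskHoldsForAllInt8() {
  for (int I = -128; I <= 127; ++I) {
    int8_t V = static_cast<int8_t>(I);
    uint8_t Sq = static_cast<uint8_t>(I * I);
    if (Sq & knownZeroMask8(numSignBits8(V)))
      return false;
  }
  return true;
}
```

With `SameOperand = true` and `NotUndef = false`, `foldMaskSecondCheck` returns 0, which matches the unfolded IR I saw. The sketch only models the gating and the mask arithmetic; it says nothing about whether the fold is sound when the operand may be undef.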

https://github.com/llvm/llvm-project/pull/153484

