[llvm] [InstCombine] Fold out-of-range bits for squaring signed integers (PR #153484)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 20 11:10:19 PDT 2025
================
@@ -409,10 +409,24 @@ static void computeKnownBitsMul(const Value *Op0, const Value *Op1, bool NSW,
   }
 
   bool SelfMultiply = Op0 == Op1;
-  if (SelfMultiply)
+  if (SelfMultiply) {
     SelfMultiply &=
         isGuaranteedNotToBeUndef(Op0, Q.AC, Q.CxtI, Q.DT, Depth + 1);
-  Known = KnownBits::mul(Known, Known2, SelfMultiply);
+
+    Known = KnownBits::mul(Known, Known2, SelfMultiply);
+
+    unsigned SignBits = ComputeNumSignBits(Op0, DemandedElts, Q, Depth + 1);
+    unsigned TyBits = Op0->getType()->getScalarSizeInBits();
+    unsigned OutValidBits = 2 * (TyBits - SignBits + 1);
+
+    if (OutValidBits < TyBits) {
+      APInt KnownZeroMask =
+          APInt::getHighBitsSet(TyBits, TyBits - OutValidBits + 1);
+      Known.Zero |= KnownZeroMask;
+    }
+  } else {
+    Known = KnownBits::mul(Known, Known2, SelfMultiply);
----------------
Aethezz wrote:
After testing, I found that adding the extra `if (SelfMultiply) {}` check causes the `update_test_checks.py` run to produce unfolded IR. I think this is because, in the preceding `if (SelfMultiply)` block, `isGuaranteedNotToBeUndef(Op0, Q.AC, Q.CxtI, Q.DT, Depth + 1)` returns false, so `SelfMultiply &= ...` clears the flag. With `SelfMultiply` now false, the extra check fails and our new logic never runs.
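
To make the difference concrete, here is a minimal standalone sketch of the two control-flow shapes. It is not LLVM code: `FoldTrace`, `nestedInOriginalCheck`, `separateSecondCheck`, and the boolean parameters `SameOperand` (standing in for `Op0 == Op1`) and `NotUndef` (standing in for the result of `isGuaranteedNotToBeUndef`) are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical record of what happened; not part of LLVM.
struct FoldTrace {
  bool SelfMultiply;   // final value of the flag passed on to the multiply
  bool RanSquareLogic; // whether the new sign-bit logic was reached
};

// Shape used in the hunk above: the new logic sits inside the original
// `if (SelfMultiply)` block, so it is reached whenever the operands match,
// even if the flag is cleared by the undef check.
constexpr FoldTrace nestedInOriginalCheck(bool SameOperand, bool NotUndef) {
  bool SelfMultiply = SameOperand;
  bool Ran = false;
  if (SelfMultiply) {
    SelfMultiply &= NotUndef; // may clear the flag
    Ran = true;               // still reached
  }
  return {SelfMultiply, Ran};
}

// Shape with an extra `if (SelfMultiply)` after the update: the new logic
// is skipped once the undef check has cleared the flag.
constexpr FoldTrace separateSecondCheck(bool SameOperand, bool NotUndef) {
  bool SelfMultiply = SameOperand;
  bool Ran = false;
  if (SelfMultiply)
    SelfMultiply &= NotUndef; // may clear the flag
  if (SelfMultiply)
    Ran = true;               // not reached when the flag was cleared
  return {SelfMultiply, Ran};
}

// Same operand on both sides, but the undef check fails.
static_assert(nestedInOriginalCheck(true, false).RanSquareLogic,
              "nested shape still reaches the new logic");
static_assert(!separateSecondCheck(true, false).RanSquareLogic,
              "extra check skips the new logic");
```

The two shapes only differ when the operands match but the undef check fails, which is the case I think the tests are hitting. The sketch shows only which branch is reached; it says nothing about which shape is the right one for the fold.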
https://github.com/llvm/llvm-project/pull/153484