[llvm] [InstCombine] canonicalize sign bit checks (PR #122962)

Jacob Young via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 16 11:04:03 PST 2025


================
@@ -7025,6 +7027,66 @@ static Instruction *canonicalizeICmpBool(ICmpInst &I,
   }
 }
 
+// (icmp X, Y) --> (icmp slt/sgt X, 0/-1) iff Y is outside the signed range of X
+static ICmpInst *canonicalizeSignBitCheck(ICmpInst::Predicate Pred, Value *X,
+                                          const ConstantRange &XRange,
+                                          const ConstantRange &YRange) {
+  if (XRange.isSignWrappedSet())
+    return nullptr;
+  unsigned BitWidth = XRange.getBitWidth();
+  APInt SMin = APInt::getSignedMinValue(BitWidth);
+  APInt Zero = APInt::getZero(BitWidth);
+  auto NegResult =
+      XRange.intersectWith(ConstantRange(SMin, Zero), ConstantRange::Signed)
+          .icmpOrInverse(Pred, YRange);
+  if (!NegResult)
+    return nullptr;
+  auto PosResult =
+      XRange.intersectWith(ConstantRange(Zero, SMin), ConstantRange::Signed)
+          .icmpOrInverse(Pred, YRange);
+  if (!PosResult)
+    return nullptr;
+  assert(NegResult != PosResult &&
+         "Known result should been simplified already.");
+  Type *Ty = X->getType();
+  if (*NegResult)
+    return new ICmpInst(ICmpInst::ICMP_SLT, X, ConstantInt::getNullValue(Ty));
+  return new ICmpInst(ICmpInst::ICMP_SGT, X, ConstantInt::getAllOnesValue(Ty));
+}
----------------
jacobly0 wrote:

I do think it would be possible to update the downstream combines to handle a different canonical form.  My main worry is losing out in the backend on various shifting tricks and simpler sign checks that don't require loading a potentially large constant and which are localized to only needing to check one bit for large integers.  I'm sure this could be mitigated by moving the range checks and sign bit check preference into the backends that care, but it certainly does not appear to currently be the case.

Under the assumption that the issue is that more information could be gained by moving the comparison constant to one of the extremes, I'm still unclear on how to choose one side of the range as more beneficial.  It seems like some information is lost on each side, by reducing the known range on either a true or false branch, which is only gained back by recomputing ranges.  At that point, the choice of constant does not seem to affect the information you have, because the source range is always split into the same two ranges by any choice of constant.

If you want to limit it to only changing the constant of an existing icmp and not changing the signedness of the predicate, then that seems perfectly reasonable and potentially simple to implement.  If you have a specific definition of canonical other than preferring strict sign bit checks, I could also try to see if I could make it work.

https://github.com/llvm/llvm-project/pull/122962


More information about the llvm-commits mailing list