[llvm] [CodeGenPrepare] Unfold slow ctpop when used in power-of-two test (PR #102731)

Sergei Barannikov via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 21 05:11:58 PDT 2025


================
@@ -1762,6 +1763,61 @@ bool CodeGenPrepare::combineToUSubWithOverflow(CmpInst *Cmp,
   return true;
 }
 
+// Decanonicalizes icmp+ctpop power-of-two test if ctpop is slow.
+bool CodeGenPrepare::unfoldPow2Test(CmpInst *Cmp) {
+  CmpPredicate Pred;
+  Value *X;
+  const APInt *C;
+
+  // (icmp (ctpop x), c)
+  if (!match(Cmp, m_ICmp(Pred, m_Intrinsic<Intrinsic::ctpop>(m_Value(X)),
+                         m_APIntAllowPoison(C))))
+    return false;
+
+  // This transformation increases the number of instructions, don't do it if
+  // ctpop is fast.
+  Type *OpTy = X->getType();
+  if (TLI->isCtpopFast(TLI->getValueType(*DL, OpTy)))
+    return false;
+
+  // ctpop(x) u< 2 -> (x & (x - 1)) == 0
+  // ctpop(x) u> 1 -> (x & (x - 1)) != 0
+  // Also handles ctpop(x) == 1 and ctpop(x) != 1 if ctpop(x) is known non-zero.
+  if ((Pred == CmpInst::ICMP_ULT && *C == 2) ||
+      (Pred == CmpInst::ICMP_UGT && *C == 1) ||
+      (ICmpInst::isEquality(Pred) && *C == 1 &&
+       isKnownNonZero(Cmp->getOperand(0), *DL))) {
+    IRBuilder<> Builder(Cmp);
+    Value *Sub = Builder.CreateAdd(X, Constant::getAllOnesValue(OpTy));
+    Value *And = Builder.CreateAnd(X, Sub);
+    CmpInst::Predicate NewPred =
+        (Pred == CmpInst::ICMP_ULT || Pred == CmpInst::ICMP_EQ)
+            ? CmpInst::ICMP_EQ
+            : CmpInst::ICMP_NE;
+    Value *NewCmp =
+        Builder.CreateICmp(NewPred, And, ConstantInt::getNullValue(OpTy));
+    Cmp->replaceAllUsesWith(NewCmp);
+    RecursivelyDeleteTriviallyDeadInstructions(Cmp);
+    return true;
+  }
+
+  // ctpop(x) == 1 -> (x ^ (x - 1)) u> (x - 1)
+  // ctpop(x) != 1 -> (x ^ (x - 1)) u<= (x - 1)
+  if (ICmpInst::isEquality(Pred) && *C == 1) {
----------------
s-barannikov wrote:

The middle end doesn't do this. [adjustIsPower2Test](https://github.com/llvm/llvm-project/pull/111284) changes `==/!= 1` --> `u< 2 / u> 1` if `ctpop(x)` is known non-zero.
CGP doesn't revisit an instruction once it has been optimized, so `unfoldPow2Test` has to be called before `adjustIsPower2Test`. I guess these two functions could be merged into a single `optimizePow2Test`. Should I do that?
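
For reference, the bit identities quoted in the hunk's comments can be checked exhaustively outside of LLVM. The following is a minimal standalone sketch, not code from the patch: the helper names (`popcount16`, `isPow2OrZeroUnfolded`, `isPow2Unfolded`, `identitiesHoldForAll16BitValues`) are hypothetical, and 16-bit values are assumed only to keep the exhaustive loop small.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count; stands in for the ctpop intrinsic.
// (Hypothetical helper, not part of the patch.)
constexpr unsigned popcount16(uint16_t x) {
  unsigned n = 0;
  for (; x != 0; x = static_cast<uint16_t>(x >> 1))
    n += x & 1u;
  return n;
}

// ctpop(x) u< 2  ->  (x & (x - 1)) == 0
constexpr bool isPow2OrZeroUnfolded(uint16_t x) {
  uint16_t m = static_cast<uint16_t>(x - 1); // wraps to 0xFFFF for x == 0
  return static_cast<uint16_t>(x & m) == 0;
}

// ctpop(x) == 1  ->  (x ^ (x - 1)) u> (x - 1)
constexpr bool isPow2Unfolded(uint16_t x) {
  uint16_t m = static_cast<uint16_t>(x - 1); // wraps to 0xFFFF for x == 0
  return static_cast<uint16_t>(x ^ m) > m;
}

// Compares the unfolded forms against the ctpop-based forms for every
// 16-bit value, including the "known non-zero" equality case.
bool identitiesHoldForAll16BitValues() {
  for (uint32_t i = 0; i <= 0xFFFFu; ++i) {
    uint16_t x = static_cast<uint16_t>(i);
    unsigned pc = popcount16(x);
    if ((pc < 2) != isPow2OrZeroUnfolded(x))
      return false;
    if ((pc == 1) != isPow2Unfolded(x))
      return false;
    // With x known non-zero, ctpop(x) == 1 agrees with (x & (x - 1)) == 0.
    if (x != 0 && (pc == 1) != isPow2OrZeroUnfolded(x))
      return false;
  }
  return true;
}
```

Calling `identitiesHoldForAll16BitValues()` from a test harness should return `true`. Note that `x == 0` is the only input on which the `u< 2` and `== 1` forms disagree, which is why the cheaper `and`-based expansion is only valid for the equality predicates when `ctpop(x)` is known non-zero.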


https://github.com/llvm/llvm-project/pull/102731

