[llvm] [CGP]: Optimize mul.overflow. (PR #148343)
Hassnaa Hamdi via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 17 06:29:11 PST 2025
================
@@ -6389,6 +6395,184 @@ bool CodeGenPrepare::optimizeGatherScatterInst(Instruction *MemoryInst,
return true;
}
+// This is a helper for CodeGenPrepare::optimizeMulWithOverflow.
+// Check the pattern we are interested in where there are maximum 2 uses
+// of the intrinsic which are the extract instructions.
+static bool matchOverflowPattern(Instruction *&I, ExtractValueInst *&MulExtract,
+ ExtractValueInst *&OverflowExtract) {
+ if (I->getNumUses() > 2)
+ return false;
+
+ for (User *U : I->users()) {
+ auto *Extract = dyn_cast<ExtractValueInst>(U);
+ if (!Extract || Extract->getNumIndices() != 1)
+ return false;
+
+ unsigned Index = Extract->getIndices()[0];
+ if (Index == 0)
+ MulExtract = Extract;
+ else if (Index == 1)
+ OverflowExtract = Extract;
+ else
+ return false;
+ }
+ return true;
+}
+
+// Rewrite the mul_with_overflow intrinsic by checking if both of the
+// operands' value ranges are within the legal type. If so, we can optimize the
+// multiplication algorithm. This code is supposed to be written during the step
+// of type legalization, but given that we need to reconstruct the IR which is
+// not doable there, we do it here.
+// The IR after the optimization will look like:
+// entry:
+// if signed:
+// ( (lhs_lo>>BW-1) ^ lhs_hi) || ( (rhs_lo>>BW-1) ^ rhs_hi) ? overflow,
+// overflow_no
+// else:
+// (lhs_hi != 0) || (rhs_hi != 0) ? overflow, overflow_no
+// overflow_no:
+// overflow:
+// overflow.res:
+// \returns true if optimization was applied
+// TODO: This optimization can be further improved to optimize branching on
+// overflow where the 'overflow_no' BB can branch directly to the false
+// successor of overflow, but that would add additional complexity so we leave
+// it for future work.
+bool CodeGenPrepare::optimizeMulWithOverflow(Instruction *I, bool IsSigned,
+ ModifyDT &ModifiedDT) {
+ // Check if target supports this optimization.
+ if (!TLI->shouldOptimizeMulOverflowWithZeroHighBits(
+ I->getContext(),
+ TLI->getValueType(*DL, I->getType()->getContainedType(0))))
+ return false;
+
+ ExtractValueInst *MulExtract = nullptr, *OverflowExtract = nullptr;
+ if (!matchOverflowPattern(I, MulExtract, OverflowExtract))
+ return false;
+
+ // Keep track of the instruction to stop reoptimizing it again.
+ InsertedInsts.insert(I);
+
+ Value *LHS = I->getOperand(0);
+ Value *RHS = I->getOperand(1);
+ Type *Ty = LHS->getType();
+ unsigned VTHalfBitWidth = Ty->getScalarSizeInBits() / 2;
+ Type *LegalTy = Ty->getWithNewBitWidth(VTHalfBitWidth);
+
+ // New BBs:
+ std::string OriginalBlockName = I->getParent()->getName().str();
----------------
hassnaaHamdi wrote:
I have to cache the name: if I gave the new block the original name directly, it would be numbered (`entry1`), because that name is already in use.
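
To illustrate the naming issue, here is a minimal sketch with a toy name table. `ToyNameTable`, `claim`, `release`, and the name `old.block` are hypothetical names made up for this example; this is not LLVM's symbol table, it only assumes that a colliding name gets a numeric suffix, as described above.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Toy stand-in for a per-function name table. This is NOT LLVM's
// implementation; it only models "colliding names get a numeric suffix".
class ToyNameTable {
  std::unordered_set<std::string> Used;
  unsigned LastUnique = 0;

public:
  // Registers Name and returns the name actually assigned. If Name is
  // already taken, a numeric suffix is appended until it is unique.
  std::string claim(const std::string &Name) {
    std::string Candidate = Name;
    while (!Used.insert(Candidate).second)
      Candidate = Name + std::to_string(++LastUnique);
    return Candidate;
  }

  // Frees a name so that it can be claimed again.
  void release(const std::string &Name) { Used.erase(Name); }
};

// Naming the new block while the original block still owns "entry".
std::string nameWithoutCaching() {
  ToyNameTable Table;
  Table.claim("entry");        // original block
  return Table.claim("entry"); // new block collides and gets a suffix
}

// Cache the name, rename the original block, then reuse the cached name.
std::string nameWithCaching() {
  ToyNameTable Table;
  std::string Original = Table.claim("entry");
  std::string Cached = Original; // keep a copy before the rename
  Table.release(Original);       // original block gives up its name
  Table.claim("old.block");      // hypothetical new name for the original block
  return Table.claim(Cached);    // new block gets the cached name unchanged
}
```

In this toy model, `nameWithoutCaching()` yields `entry1`, while `nameWithCaching()` yields `entry`.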
https://github.com/llvm/llvm-project/pull/148343
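
For readers following the patch comment in the quoted hunk, the entry-block check can be sketched in plain C++. This is only an illustration under assumptions: it fixes the full width at 64 bits with 32-bit halves, and the slow path uses a generic division-based check rather than whatever the patch emits in the `overflow` block. `MulResult`, `umulWithOverflow`, and `fitsInLowHalfSigned` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

struct MulResult {
  uint64_t Value; // product, wrapped modulo 2^64
  bool Overflow;  // true if the exact product does not fit in 64 bits
};

// Unsigned case: (lhs_hi != 0) || (rhs_hi != 0) selects the slow path.
// If both high halves are zero, the operands are below 2^32, so their
// product is below 2^64 and cannot overflow.
MulResult umulWithOverflow(uint64_t LHS, uint64_t RHS) {
  const uint64_t LHSHi = LHS >> 32;
  const uint64_t RHSHi = RHS >> 32;
  if (LHSHi == 0 && RHSHi == 0)
    return {LHS * RHS, false}; // fast path: no overflow possible

  // Slow path (assumption: a generic check, not the patch's lowering).
  const bool Overflow =
      LHS != 0 && RHS > std::numeric_limits<uint64_t>::max() / LHS;
  return {LHS * RHS, Overflow};
}

// Signed case: the high half must equal the sign extension of the low half,
// which is the same as saying the value lies in the int32_t range. Two such
// values have a product of magnitude at most 2^62, which fits in int64_t.
bool fitsInLowHalfSigned(int64_t X) {
  return X >= std::numeric_limits<int32_t>::min() &&
         X <= std::numeric_limits<int32_t>::max();
}
```

The range comparison in `fitsInLowHalfSigned` expresses the same condition as the `(lo >> BW-1) ^ hi` test in the patch comment, without relying on narrowing conversions.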