[llvm] [InstCombine] Transform high latency, dependent FSQRT/FDIV into FMUL (PR #87474)
Sushant Gokhale via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 3 00:04:23 PDT 2024
================
@@ -626,6 +626,88 @@ Instruction *InstCombinerImpl::foldPowiReassoc(BinaryOperator &I) {
return nullptr;
}
+// Check legality for transforming
+// x = 1.0/sqrt(a)
+// r1 = x * x;
+// r2 = a/sqrt(a);
+//
+// TO
+//
+// r1 = 1/a
+// r2 = sqrt(a)
+// x = r1 * r2
+static bool isFSqrtDivToFMulLegal(Instruction *X, ArrayRef<Instruction *> R1,
+ ArrayRef<Instruction *> R2) {
+ BasicBlock *BBx = X->getParent();
+ BasicBlock *BBr1 = R1[0]->getParent();
+ BasicBlock *BBr2 = R2[0]->getParent();
+
+ CallInst *FSqrt = cast<CallInst>(X->getOperand(1));
+ if (!FSqrt->hasAllowReassoc() || !FSqrt->hasNoNaNs() ||
+ !FSqrt->hasNoSignedZeros() || !FSqrt->hasNoInfs())
----------------
sushgokh wrote:
The optimization is valid only for positive normal values. We can't put restrictions on the values of `a`, so I had to put the constraints on the sqrt call instruction instead, since it is used by all of x/r1/r2. Also, x/r1/r2 can each have multiple uses, so their values before and after the transform need to match.
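
To illustrate the point about positive normals, here is a minimal standalone sketch (not code from the patch) that evaluates both instruction sequences in plain C++ with IEEE-754 doubles. The names `Results`, `original`, `transformed`, `closeEnough`, and `selfCheck` are hypothetical and exist only for this illustration. It assumes the program is built without fast-math options, so that infinities and NaNs behave as IEEE-754 specifies.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the three values discussed in the patch comment.
struct Results {
  double x, r1, r2;
};

// Sequence before the transform: x = 1/sqrt(a), r1 = x*x, r2 = a/sqrt(a).
Results original(double a) {
  double s = std::sqrt(a);
  double x = 1.0 / s;
  return {x, x * x, a / s};
}

// Sequence after the transform: r1 = 1/a, r2 = sqrt(a), x = r1*r2.
Results transformed(double a) {
  double r1 = 1.0 / a;
  double r2 = std::sqrt(a);
  return {r1 * r2, r1, r2};
}

// Relative comparison; returns false if either side is NaN or both are inf.
bool closeEnough(double p, double q) {
  return std::fabs(p - q) <= 1e-12 * std::fabs(q);
}

// Small usage example: the two sequences agree for a positive normal input
// but disagree for a == 0.0.
void selfCheck() {
  Results o = original(4.0), t = transformed(4.0);
  assert(closeEnough(o.x, t.x) && closeEnough(o.r1, t.r1) &&
         closeEnough(o.r2, t.r2));

  Results oz = original(0.0), tz = transformed(0.0);
  assert(std::isinf(oz.x) && std::isnan(tz.x));   // 1/0 vs. inf*0
  assert(std::isnan(oz.r2) && tz.r2 == 0.0);      // 0/0 vs. sqrt(0)
}
```

For `a == 0.0` the original sequence produces `x = inf` and `r2 = NaN`, while the transformed sequence produces `x = NaN` and `r2 = 0.0`, so the rewrite is only sound when such inputs are excluded, which is what the fast-math flag checks on the sqrt call are meant to guarantee.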
https://github.com/llvm/llvm-project/pull/87474