[llvm] [InstCombine] Transform high latency, dependent FSQRT/FDIV into FMUL (PR #87474)
Sushant Gokhale via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 22 04:21:12 PDT 2024
================
@@ -1796,6 +1878,64 @@ static Instruction *foldFDivSqrtDivisor(BinaryOperator &I,
return BinaryOperator::CreateFMulFMF(Op0, NewSqrt, &I);
}
+// Change
+// X = 1/sqrt(a)
+// R1 = X * X
+// R2 = a * X
+//
+// TO
+//
+// Tmp1 = 1/a
+// Tmp2 = sqrt(a)
+// Tmp3 = Tmp1 * Tmp2
+// Replace Uses Of R1 With Tmp1
+// Replace Uses Of R2 With Tmp2
+// Replace Uses Of X With Tmp3
+static Value *convertFSqrtDivIntoFMul(CallInst *CI, Instruction *X,
+ ArrayRef<Instruction *> R1,
+ ArrayRef<Instruction *> R2, Value *SqrtOp,
+ InstCombiner::BuilderTy &B) {
+
+ B.SetInsertPoint(X);
+
+ // Every instance of R1 may have different fpmath metadata and fpmath flags.
+ // We try to preserve them by having separate fdiv instruction per R1
+ // instance.
+ Instruction *Tmp1;
----------------
sushgokh wrote:
Tmp1 and Tmp2 are both used to synthesize Tmp3, so I need to keep this the way it is now. If there is a better way of doing this, I can certainly change it.
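
For readers following the hunk above, here is a minimal standalone sketch of the algebra behind the rewrite in plain C++ on doubles. It is not code from the patch: the `Results` struct and the `original`, `rewritten`, and `nearlyEqual` helpers are hypothetical names used only for illustration, and the sketch assumes a positive, finite input.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical holder for the three values named in the patch comment.
struct Results {
  double X;  // 1/sqrt(a)
  double R1; // X * X
  double R2; // a * X
};

// Original form: both multiplies wait on the fdiv, which waits on the sqrt.
Results original(double a) {
  assert(a > 0.0 && "sketch only covers positive inputs");
  double X = 1.0 / std::sqrt(a);
  return {X, X * X, a * X};
}

// Rewritten form: the fdiv and the sqrt no longer depend on each other.
Results rewritten(double a) {
  assert(a > 0.0 && "sketch only covers positive inputs");
  double Tmp1 = 1.0 / a;       // stands in for R1, since (1/sqrt(a))^2 == 1/a
  double Tmp2 = std::sqrt(a);  // stands in for R2, since a/sqrt(a) == sqrt(a)
  double Tmp3 = Tmp1 * Tmp2;   // stands in for X, since sqrt(a)/a == 1/sqrt(a)
  return {Tmp3, Tmp1, Tmp2};
}

// Relative comparison; the two forms may round differently in the last bits.
bool nearlyEqual(double L, double R, double RelTol = 1e-12) {
  return std::fabs(L - R) <= RelTol * std::fmax(std::fabs(L), std::fabs(R));
}
```

The two forms agree mathematically but are not guaranteed to be bit-identical in floating point, which is why the sketch compares with a tolerance. A compiler would presumably need relaxed floating-point semantics to make this substitution.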
https://github.com/llvm/llvm-project/pull/87474