[PATCH] D86726: [InstCombine]: Transform 1.0/sqrt(X) * X to X/sqrt(X) and X * 1.0/sqrt(X) to X/sqrt(X)
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 31 08:41:05 PDT 2020
spatel added a comment.
Note that we may still have backend gaps in sqrt codegen, so we will need to watch out for regressions. I tried to squash those better with:
rG716e35a0cf53 <https://reviews.llvm.org/rG716e35a0cf53e85a5fddc7ff86b79a751b1b2040>
rG1c9a09f42e5e <https://reviews.llvm.org/rG1c9a09f42e5ed66cba04700f9272ff53ea3cca86>
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp:553-558
+ if (match(Op0, (m_FDiv(m_SpecificFP(1.0), m_Value(Y)))) &&
+ match(Y, m_Intrinsic<Intrinsic::sqrt>(m_Value(X))) && Op1 == X)
+ return BinaryOperator::CreateFDivFMF(X, Y, &I);
+ if (match(Op1, (m_FDiv(m_SpecificFP(1.0), m_Value(Y)))) &&
+ match(Y, m_Intrinsic<Intrinsic::sqrt>(m_Value(X))) && Op0 == X)
+ return BinaryOperator::CreateFDivFMF(X, Y, &I);
----------------
We should predicate this fold using the same FMF that the backend is using. Currently at least, we are requiring "nsz", so that should be checked here.
================
Comment at: llvm/test/Transforms/InstCombine/fmul-sqrt.ll:110-112
%sqrt = call fast double @llvm.sqrt.f64(double %x)
%rsqrt = fdiv fast double 1.0, %sqrt
%res = fmul reassoc double %rsqrt, %x
----------------
This test could provide better coverage for the logic as implemented. The only FMF requirement is the "reassoc" (and likely "nsz") on the fmul, and that's typical of our FP folds so far.
If we want to make that more conservative by requiring FMF on the fdiv or fsqrt too, this test would then offer a potential counter-example.
There's some discussion about the semantics/subtlety of FMF in:
https://llvm.org/PR46326
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D86726/new/
https://reviews.llvm.org/D86726
More information about the llvm-commits
mailing list