[PATCH] D86726: [InstCombine]: Transform 1.0/sqrt(X) * X to X/sqrt(X) and X * 1.0/sqrt(X) to X/sqrt(X)

Venkataramanan Kumar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 31 11:13:23 PDT 2020


venkataramanan.kumar.llvm added inline comments.


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp:553-558
+    if (match(Op0, (m_FDiv(m_SpecificFP(1.0), m_Value(Y)))) &&
+        match(Y, m_Intrinsic<Intrinsic::sqrt>(m_Value(X))) && Op1 == X)
+      return BinaryOperator::CreateFDivFMF(X, Y, &I);
+    if (match(Op1, (m_FDiv(m_SpecificFP(1.0), m_Value(Y)))) &&
+        match(Y, m_Intrinsic<Intrinsic::sqrt>(m_Value(X))) && Op0 == X)
+      return BinaryOperator::CreateFDivFMF(X, Y, &I);
----------------
spatel wrote:
> We should predicate this fold using the same FMF that the backend is using. Currently at least, we are requiring "nsz", so that should be checked here.
When x= -0.0  then 1.0/sqrt(-0,0) *  -0.0  = NAN = -0.0/sqrt(-0.0).   So the transform as such don't require "nsz" check right? 

But yes then the folding will not happen in the back end when we  have -0.0 .  I will add the nsz check in the next revision of this patch. 
 


================
Comment at: llvm/test/Transforms/InstCombine/fmul-sqrt.ll:110-112
   %sqrt = call fast double @llvm.sqrt.f64(double %x)
   %rsqrt = fdiv fast double 1.0, %sqrt
   %res = fmul reassoc double %rsqrt, %x
----------------
spatel wrote:
> This test could provide better coverage for the logic as implemented. The only FMF requirement is the "reassoc" (and likely "nsz") on the fmul, and that's typical of our FP folds so far. 
> 
> If we want to make that more conservative by requiring FMF on the fdiv or fsqrt too, this test would then offer a potential counter-example.
> 
> There's some discussion about the semantics/subtlety of FMF in:
> https://llvm.org/PR46326
is there any restriction that FMF should be set on all the operands we try to re-associate ?  

When I read the PR it seems there is no such rule.  

For the patch I will add "nsz" to fmul.  It already has the "fast" setting for other operations.  is that the correct understanding?



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86726/new/

https://reviews.llvm.org/D86726



More information about the llvm-commits mailing list