[PATCH] D80706: [DAGCombine] Add hook to allow target specific test for sqrt input

Wed Nov 4 20:12:24 PST 2020

shchenz accepted this revision.
shchenz added a comment.
This revision is now accepted and ready to land.

LGTM. Thanks for improving this.

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:12398
+  // Otherwise fe_flag is set to 0.
+  SDValue SRIdxVal = DAG.getTargetConstant(PPC::sub_eq, DL, MVT::i32);
+  return SDValue(DAG.getMachineNode(TargetOpcode::EXTRACT_SUBREG, DL, MVT::i1,
----------------
steven.zhang wrote:
> shchenz wrote:
> > steven.zhang wrote:
> > > shchenz wrote:
> > > > Target independent code checks denormal input, `ftsqrt` denormal input checking result is in `fg_flag`, your comments seems like related to fe_flag, does this matters?
> > > Target independent code didn't assume the content of the check and the target is free to do the kind of check. The ISA contains a program note:
> > > ```
> > > ftdiv and ftsqrt are provided to accelerate software
> > > emulation of divide and square root operations, by
> > > performing the requisite special case checking.
> > > Software needs only a single branch, on FE=1 (in
> > > CR[BF]), to a special case handler. FG and FL may
> > > provide further acceleration opportunities.
> > > ```
> > > So, I select the FE for the special case handle.
> > I think there is functionality issue here if we use `fe_flag`, not `fg_flag`. From the comments in target independent code:
> > ```
> >  // The estimate is now completely wrong if the input was exactly 0.0 or
> >  // possibly a denormal. Force the answer to 0.0 for those cases.
> > ```
> > 
> > The iteration method to calculate the sqrt would be wrong if the input if denormal.
> > 
> > But in PowerPC's hook implementation, `fe_flag` will not be set even if the input is denormal. So now for denormal input, we may also use the newton iterated `est` after testing `fe_flag`.
> According to http://web.mit.edu/hyperbook/Patrikalakis-Maekawa-Cho/node47.html, the double floating point is denormal if exp < -1022. So, the ftsqrt must return 1 as it is set if e_b <= -970. That means we won't have functionality issue but with precision issue for the value between exp >= -1022 ~ exp <= -970, which is handled by D80974
Thanks for your explanation. So if flag fg_flag is 1, fe_flag must be also 1. For the normal input cases where fe_flag is 1, but fg_flag is 0, you handle them in D80974. This makes sense to me.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80706/new/

https://reviews.llvm.org/D80706