[PATCH] D80974: [DAGCombine] Adding a hook to improve the precision of fsqrt if the input is denormal

Tue Nov 10 21:05:24 PST 2020

steven.zhang requested review of this revision.
steven.zhang added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:21981
+            Test.getValueType().isVector() ? ISD::VSELECT : ISD::SELECT, DL, VT,
+            Test, TLI.getSqrtResultForDenormInput(Op, DAG), Est);
       }
----------------
shchenz wrote:
> On the PowerPC target, we have used 0.0 as the sqrt result for denormal floating-point input for a long time. Changing the result to the hardware sqrt instruction will improve the precision for sure, but it also degrades runtime performance. Is it possible to do it like this: if we care about performance, we use 0.0; if we care about precision, we use the hardware sqrt instruction. Maybe `-Ofast` is an indication for this?
It won't degrade runtime performance when the input is not denormal, which is the usual case: the select is later expanded into a compare and branch, and the hardware will predict the normal code path. It does slow things down when the input is denormal, but that is expected, since the precision is improved. All of this optimization is done under -Ofast.

Considering that we only affect the denormal-input code path, where precision usually matters more than performance, I tend to keep it this way. Does that make sense?
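
To make the trade-off concrete, here is a rough scalar model of what the select computes. It is only a sketch, not the code from the patch: `estimateSqrt`, `sqrtWithDenormFallback`, and the `PreciseDenormal` knob are hypothetical names, `1.0f / std::sqrt(X)` stands in for a hardware reciprocal-sqrt estimate, and the test is assumed to be "magnitude below the smallest normal value".

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Hypothetical stand-in for Est: a reciprocal-sqrt estimate refined by one
// Newton-Raphson step, then multiplied by X. A real target would use an
// estimate instruction; 1/sqrt is used here only to keep the sketch portable.
static float estimateSqrt(float X) {
  float R = 1.0f / std::sqrt(X);
  R = R * (1.5f - 0.5f * X * R * R);
  return X * R;
}

// Scalar model of select(Test, DenormResult, Est).
// PreciseDenormal is a hypothetical knob for illustration, not a flag from
// the patch: true models the hardware sqrt fallback, false models 0.0.
static float sqrtWithDenormFallback(float X, bool PreciseDenormal) {
  // Assumed Test: |X| is below the smallest normal float (denormal or zero).
  bool IsDenormOrZero = std::fabs(X) < FLT_MIN;
  if (IsDenormOrZero)
    return PreciseDenormal ? std::sqrt(X) : 0.0f; // rare, slower path
  return estimateSqrt(X);                         // usual, fast path
}

// Small usage check: normal input takes the estimate path in both modes;
// only denormal input sees a different result.
static void checkSketch() {
  assert(sqrtWithDenormFallback(4.0f, true) == 2.0f);
  assert(sqrtWithDenormFallback(1e-40f, false) == 0.0f);
  assert(sqrtWithDenormFallback(1e-40f, true) > 0.0f);
}
```

In this model the two choices differ only inside the `IsDenormOrZero` branch, which is the point above: normal inputs never reach the fallback.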

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80974/new/

https://reviews.llvm.org/D80974