[PATCH] D61149: [DAGCombiner] try repeated fdiv divisor transform before building estimate

Thu Apr 25 13:55:06 PDT 2019

spatel created this revision.
spatel added reviewers: RKSimon, craig.topper.
Herald added subscribers: hiraditya, mcrosier.
Herald added a project: LLVM.

This was originally part of D61028 <https://reviews.llvm.org/D61028>, but it's an independent diff.

If we do the repeated-divisor reciprocal transform before producing an estimate sequence, then we have an opportunity to use a scalar fdiv. On x86, the trade-off is 1 divss vs. 5 vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen), full-precision division has a throughput of only 3 cycles, so that's probably the better default for performance, and it avoids problems from x86's inaccurate estimates.
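
To illustrate the idea at the source level, here is a minimal C++ sketch of the repeated-divisor transform for a vector divided by a splatted scalar. The type alias and function names (Vec4, divide_each_lane, multiply_by_reciprocal) are hypothetical and are not taken from DAGCombiner.cpp; the actual combine works on SelectionDAG nodes, not on arrays.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-lane float vector standing in for an xmm register.
using Vec4 = std::array<float, 4>;

// Before the transform: every lane performs its own division by the
// same divisor d.
Vec4 divide_each_lane(const Vec4 &x, float d) {
  Vec4 out{};
  for (std::size_t i = 0; i < x.size(); ++i)
    out[i] = x[i] / d;
  return out;
}

// After the transform: a single scalar division computes the reciprocal
// once, and each lane multiplies by it.
Vec4 multiply_by_reciprocal(const Vec4 &x, float d) {
  const float r = 1.0f / d; // the one full-precision scalar divide
  Vec4 out{};
  for (std::size_t i = 0; i < x.size(); ++i)
    out[i] = x[i] * r;
  return out;
}
```

The two functions can differ in the last bit of the result because the reciprocal is rounded before the multiply, so a rewrite like this is only valid when relaxed floating-point rules permit it.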

The last 2 tests show that users still have the option to override the defaults by using the function attributes for reciprocal estimates, but we can potentially make those faster by converting vector ops (including ymm ops) to scalar math.

https://reviews.llvm.org/D61149

Files:
  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  llvm/test/CodeGen/X86/fdiv-combine-vec.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D61149.196720.patch
Type: text/x-patch
Size: 5437 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190425/04433cac/attachment.bin>