[PATCH] D61028: [DAGCombiner] scale repeated FP divisor by splat factor

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 23 10:14:46 PDT 2019


spatel created this revision.
spatel added reviewers: RKSimon, craig.topper, andreadb.
Herald added subscribers: hiraditya, mcrosier.
Herald added a project: LLVM.

If we have a vector FP division with a splatted divisor, we can count that divisor as repeated once per vector element, so the existing transform that converts 'x/y' into 'x * (1.0/y)' fires in more cases. The reciprocal can then potentially be converted into a scalar FP division by existing combines (rL358984 <https://reviews.llvm.org/rL358984>) as seen in the tests here.

That can potentially be a big perf difference if scalar fdiv has better timing than vector fdiv (including avoiding possible frequency throttling for vector ops).
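
As a rough illustration of the arithmetic only (not the DAGCombiner code in the patch), the sketch below contrasts a per-lane division by a splatted value with one scalar reciprocal followed by per-lane multiplies. The names Vec4, divPerLane and mulByReciprocal are hypothetical and exist only for this example. The two forms can round differently in general, which is why a transform like this depends on reciprocal/fast-math being allowed; the example values use a power-of-two divisor so both forms are exact.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a <4 x float> vector value.
using Vec4 = std::array<float, 4>;

// Before: every lane divides by the same (splatted) divisor y,
// which corresponds to one vector fdiv.
Vec4 divPerLane(const Vec4 &x, float y) {
  Vec4 r{};
  for (std::size_t i = 0; i < x.size(); ++i)
    r[i] = x[i] / y;
  return r;
}

// After: a single scalar division computes 1.0/y once, and each lane
// multiplies by that reciprocal (one scalar fdiv plus one vector fmul).
Vec4 mulByReciprocal(const Vec4 &x, float y) {
  const float recip = 1.0f / y;
  Vec4 r{};
  for (std::size_t i = 0; i < x.size(); ++i)
    r[i] = x[i] * recip;
  return r;
}
```

With a divisor such as 4.0f the reciprocal 0.25f is exactly representable, so both functions return identical lanes; with a divisor such as 3.0f the last bit of some lanes may differ.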

There's another diff here in the ordering of the transforms - I'm proposing to move the repeated divisor transform ahead of the reciprocal estimate transform because that seems more likely to produce the best results. For default x86, we don't turn fdiv f32 into an estimate because the estimate accuracy is too poor for most code. Not using the estimate is probably the right perf choice for current and future CPUs since divss throughput is down to the 3-4 cycle range (Skylake/Ryzen).


https://reviews.llvm.org/D61028

Files:
  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  llvm/test/CodeGen/X86/fdiv-combine-vec.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D61028.196272.patch
Type: text/x-patch
Size: 6149 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190423/5c26a1d7/attachment.bin>

