[PATCH] D22975: Compute the Newton series natively

Tue Aug 9 16:33:15 PDT 2016

spatel added a comment.

Side note regarding select folding with -1/0: inspired by this patch, I filed PR28895 ( https://llvm.org/bugs/show_bug.cgi?id=28895 ).
There are a few different paths and optimizations for this in x86. Some of it (e.g., https://reviews.llvm.org/D23337 ) could be lifted to the generic DAG combiner, I think.

When I looked at how AArch64 handled the case in PR28895, I noticed that the select always gets cracked into and/andn/or and then re-matched into a vbsl. That seems like a better general policy than what x86 is doing (matching to ISD::VSELECT early).
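
For reference, here is a minimal scalar sketch of why the and/andn/or form is equivalent to a select when each mask lane is all-ones (-1) or all-zeros (0). The helper names (lane_mask, bitselect) are made up for illustration and are not LLVM APIs; one uint32_t stands in for a single vector lane.

```cpp
#include <cstdint>

// Hypothetical helper: turn a boolean condition into a -1/0 lane mask,
// which is what a vector compare produces per lane.
static inline uint32_t lane_mask(bool cond) {
  return cond ? 0xFFFFFFFFu : 0u;
}

// Hypothetical helper: select expressed with the base logic ops.
//   and  : mask & a   keeps the bits of a where the mask is set
//   andn : ~mask & b  keeps the bits of b where the mask is clear
//   or   : merges the two disjoint halves
static inline uint32_t bitselect(uint32_t mask, uint32_t a, uint32_t b) {
  return (mask & a) | (~mask & b);
}
```

With a -1/0 mask this returns a or b whole. With a mixed mask it selects bit by bit, which is the bitwise behavior that vbsl provides.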

Regardless of all that, we really do want to avoid vblendv on x86 in this patch. As Simon hinted, some cores suffer greatly because the HW cracks vblendv into the base logic ops (and/andn/or), so that instruction's latency/throughput is 3 times worse than that of a simple op.

Repository:
  rL LLVM

https://reviews.llvm.org/D22975