[PATCH] D22975: Compute the Newton series natively
Sanjay Patel via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 9 16:33:15 PDT 2016
spatel added a comment.
Side note regarding select folding with -1/0: inspired by this patch, I filed PR28895 ( https://llvm.org/bugs/show_bug.cgi?id=28895 ).
There are a few different paths and optimizations for this in x86. Some of it (e.g., https://reviews.llvm.org/D23337 ) could be lifted to the generic DAG combiner, I think.
When I looked at how AArch64 handled the case in PR28895, I noticed that the select always gets cracked into and/andn/or and then re-matched into a vbsl. That seems like a better general policy than what x86 is doing (matching to ISD::VSELECT early).
Regardless of all that, we really do want to avoid vblendv on x86 in this patch. As Simon hinted, some cores suffer greatly because the hardware cracks vblendv into the base logic ops (and/andn/or), so that instruction has 3 times worse latency/throughput than a simple op.
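For reference, here is a minimal scalar sketch of why a select whose mask lanes are all -1 or 0 can be expressed as and/andn/or. It is only a model of the identity, not LLVM code: the names Vec4, laneSelect, and bitwiseSelect are hypothetical and do not correspond to anything in the DAG combiner or the x86/AArch64 backends.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 4-lane vector of 32-bit integers, standing in for a v4i32.
using Vec4 = std::array<std::uint32_t, 4>;

// Lane-wise select: pick a[i] where the mask lane is non-zero, else b[i].
Vec4 laneSelect(const Vec4 &mask, const Vec4 &a, const Vec4 &b) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = mask[i] != 0 ? a[i] : b[i];
  return r;
}

// The same select written with the base logic ops: (mask & a) | (~mask & b).
// This only matches laneSelect when every mask lane is all-ones or zero.
Vec4 bitwiseSelect(const Vec4 &mask, const Vec4 &a, const Vec4 &b) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i) {
    assert(mask[i] == 0u || mask[i] == 0xFFFFFFFFu);
    r[i] = (mask[i] & a[i]) | (~mask[i] & b[i]); // and, andn, or
  }
  return r;
}
```

With a mask such as {-1, 0, -1, 0}, both functions return the same lanes, which is the property that lets the logic-op form stand in for a blend.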
Repository:
rL LLVM
https://reviews.llvm.org/D22975