[llvm-dev] Adding support for self-modifying branches to LLVM?
Sean Silva via llvm-dev
llvm-dev at lists.llvm.org
Thu Jan 21 13:51:39 PST 2016
On Thu, Jan 21, 2016 at 1:33 PM, Philip Reames <listmail at philipreames.com>
wrote:
>
>
> On 01/19/2016 09:04 PM, Sean Silva via llvm-dev wrote:
>
>
> AFAIK, the cost of a well-predicted, not-taken branch is the same as a nop
> on every x86 made in the last many years. See
> http://www.agner.org/optimize/instruction_tables.pdf
> Generally speaking a correctly-predicted not-taken branch is basically
> identical to a nop, and a correctly-predicted taken branch has an extra
> overhead similar to an "add" or other extremely cheap operation.
>
> Specifically on this point only: While absolutely true for most
> micro-benchmarks, this is less true at large scale. I've definitely seen
> removing a highly predictable branch (in many, many places, some of which
> are hot) to benefit performance in the 5-10% range. For instance, removing
> highly predictable branches is the primary motivation of implicit null
> checking (http://llvm.org/docs/FaultMaps.html). Where exactly the
> performance improvement comes from is hard to say, but, empirically, it
> does matter.
>
> (Caveat to above: I have not run an experiment that actually put in the
> same number of bytes in nops. It's possible the entire benefit I mentioned
> is code size related, but I doubt it given how many ticks a sample profiler
> will show on said branches.)
>
Interesting. Another possible explanation is that these extra branches
cause contention on branch-prediction resources. In the past when talking
with Dan about WebAssembly sandboxing, IIRC he said that they found about
15% overhead, due primarily to branch-prediction resource contention. In
fact I think they had a pretty clear idea of wanting a new instruction
which is just a "statically predict never taken and don't use any
branch-prediction resources" branch (this is on x86 IIRC; some other
arches obviously do have such an instruction!).
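
To make the kind of branch under discussion concrete, here is a minimal
sketch of a sandbox-style bounds check. The names checked_load and
sum_checked are hypothetical, not taken from any real sandboxing
implementation. __builtin_expect is the GCC/Clang builtin for marking a
path as cold; as far as I know it only influences how the compiler lays out
the code, so the check is still an ordinary conditional branch that the
hardware may track, which is the resource-contention concern above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sandboxed load: every access is guarded by a bounds check
// that almost never fails, i.e. a highly predictable not-taken branch.
inline std::uint32_t checked_load(const std::uint32_t *heap,
                                  std::size_t heap_len, std::size_t index,
                                  bool &trapped) {
  // Compiler-level hint only: marks the failure path as unlikely.
  if (__builtin_expect(index >= heap_len, 0)) {
    trapped = true; // stand-in for a real trap handler
    return 0;
  }
  return heap[index];
}

// A hot loop where the check is executed once per element.
std::uint64_t sum_checked(const std::uint32_t *heap, std::size_t heap_len,
                          bool &trapped) {
  std::uint64_t total = 0;
  for (std::size_t i = 0; i < heap_len; ++i)
    total += checked_load(heap, heap_len, i, trapped);
  return total;
}
```

With many such checks spread over a large program, each one is a separate
branch site, even though none of them is ever taken in normal operation.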
-- Sean Silva
>
> p.s. Sean mentions down-thread that most of the slowdown from checks is in
> the effect on the optimizer, not the direct impact of the instructions
> emitted. This is absolutely our experience as well. I don't intend for
> anything I said above to imply otherwise.
>
> Philip
>
>