[llvm-dev] Adding support for self-modifying branches to LLVM?

Tue Jan 19 21:04:00 PST 2016

On Tue, Jan 19, 2016 at 9:40 AM, Jonas Wagner via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi,
>
> I’m thinking about using LLVM to implement a limited form of
> self-modifying code. Before diving into that, I’d like to get some feedback
> from you all.
>
> *The goal:* I’d like to add “optional” code to a program that I can
> enable at runtime and that has zero (i.e., as close to zero as I can get)
> overhead when not enabled.
>
> *Existing solutions:* Currently, I can guard optional code using a
> branch, something like br i1 %cond, label %optional, label %skip, !prof !0.
> Branch weights ensure that the branch is predicted correctly. The overhead
> of this is not as low as I’d like,
>

How low would you like it? What use case is suffering for performance due
to the branch?

Self-modifying code for truly zero-overhead (when not enabled)
instrumentation is a real thing (look at e.g. the DTrace pid provider), but
unless the number of instrumentation points is very large (hundreds of
thousands? millions?) or not known beforehand (both are true for DTrace),
the cost of a branch will be negligible.
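
To make the patching approach concrete, here is a rough sketch in C of
turning a 5-byte x86 nop into a 5-byte "jmp rel32". The function and
constant names are hypothetical, and it only rewrites bytes in an ordinary
buffer. Patching live code additionally needs writable text pages and care
around threads that may be executing the patch site, which is not shown.

```c
#include <stdint.h>
#include <string.h>

enum { PATCH_SIZE = 5 };

/* The 5-byte x86 nop "nopl 0x0(%rax,%rax,1)"; same length as "jmp rel32". */
static const uint8_t NOP5[PATCH_SIZE] = {0x0F, 0x1F, 0x44, 0x00, 0x00};

/* Hypothetical helper: overwrite the nop held in `site` with a jump to
 * target_addr. site_addr is the address the bytes would have in the running
 * program. Returns 0 on success, -1 if `site` does not hold the expected
 * nop or the target is out of rel32 range. */
static int encode_jmp_over_nop(uint8_t site[PATCH_SIZE], uint64_t site_addr,
                               uint64_t target_addr) {
    if (memcmp(site, NOP5, PATCH_SIZE) != 0)
        return -1;

    /* rel32 is measured from the end of the jump instruction. */
    int64_t disp = (int64_t)(target_addr - (site_addr + PATCH_SIZE));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t rel = (uint32_t)(int32_t)disp;
    site[0] = 0xE9; /* opcode for jmp rel32 */
    for (int i = 0; i < 4; i++)
        site[1 + i] = (uint8_t)(rel >> (8 * i)); /* little-endian immediate */
    return 0;
}
```

For a site at 0x1000 and a target at 0x1100, this would write the bytes
E9 FB 00 00 00.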

AFAIK, the cost of a well-predicted, not-taken branch is the same as a nop
on every x86 made in the last many years. See
http://www.agner.org/optimize/instruction_tables.pdf
Generally speaking, a correctly-predicted not-taken branch is basically
identical to a nop, and a correctly-predicted taken branch has an extra
overhead similar to an "add" or other extremely cheap operation. More
concerning is that the condition being branched on is probably some flag
in memory somewhere and will require a memory operation to check it (but of
course on a good OoO core w/ speculative execution this doesn't hold up
anything but the retire queue).
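
For comparison, the plain branch-guarded version might look something like
the sketch below. All names are made up for illustration, and
__builtin_expect is the GCC/Clang builtin that gives the compiler the same
kind of hint as the !prof branch weights. The flag is volatile only so that
the load really happens on every call in this small example.

```c
#include <stdbool.h>

/* Hypothetical runtime switch; the load of this flag is the memory
 * operation mentioned above. */
static volatile bool optional_enabled = false;
static unsigned long optional_calls = 0;

/* Stand-in for the "optional" code that is normally skipped. */
static void optional_code(int x) {
    optional_calls += (unsigned long)x;
}

static int work(int x) {
    /* Hint that the optional path is unlikely, so the common path falls
     * through as a not-taken branch. */
    if (__builtin_expect(optional_enabled, 0))
        optional_code(x);
    return x * 2 + 1;
}
```

With the flag clear, work() costs one load plus one not-taken branch on top
of the real work. Setting optional_enabled at run time turns the optional
path on without touching the code.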

-- Sean Silva

> though, because the branch is still present in the code and because
> computing %cond also has some cost.
>
> *The idea:* I’d like to have a branch that is the same as the example
> above, but that gets translated into a nop instruction. Preferably some
> unique nop that I can easily recognize in the binary, and that has the
> same size as an unconditional branch instruction. Then, I could use a
> framework such as DynInst to replace that nop with an unconditional
> branch instruction at run-time.
>
> My questions to the community would be:
>
>    - Does the idea make sense, or am I missing a much simpler approach?
>    - What would be the easiest way to obtain the desired binary? Adding a
>    new TerminatorInstruction sounds daunting, is there something simpler?
>
> I also wonder whether I could even expect speedups from this? Are nop
> instructions actually cheaper than branches? Would modifying the binary at
> run-time play well enough with caches etc.? These are probably not
> questions for the LLVM mailing list, but if anybody has good answers they
> are welcome.
>
> Looking forward to hearing your thoughts,
> Jonas
> 
>