[llvm-dev] [RFC] Order File Instrumentation

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Thu Jan 17 10:53:02 PST 2019

Hi Manman,

Ordering profiling is certainly very useful to have for startup-time
performance. GCC has something similar.

In terms of implementation, it is possible to simply add one extra slot per
function to the edge profiling counters, and instrument the function to
record a time stamp in that slot the first time the function is executed.
The overhead will be minimal and you can leverage all the other existing
support in the profiling runtime.
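
A rough sketch of what I mean, in plain C rather than as instrumented IR.
All names here (first_exec_stamp, logical_clock, record_first_execution,
NUM_FUNCS) are made up for illustration and are not existing profiling
runtime symbols. It also assumes a monotonically increasing logical clock
in place of a real time stamp, which is enough to recover the order:

```c
#include <stdatomic.h>
#include <stdint.h>

#define NUM_FUNCS 3 /* hypothetical: number of instrumented functions */

/* One extra slot per function; 0 means "not executed yet". */
static _Atomic uint64_t first_exec_stamp[NUM_FUNCS];

/* Assumed stand-in for a time stamp: a global logical clock. */
static _Atomic uint64_t logical_clock;

/* What the instrumented prologue of function `func_id` would do. */
static void record_first_execution(unsigned func_id) {
    /* Fast path: the slot is already filled, nothing to record. */
    if (atomic_load_explicit(&first_exec_stamp[func_id],
                             memory_order_relaxed) != 0)
        return;

    /* Take the next tick (ticks start at 1 so 0 stays "unset"). */
    uint64_t now = atomic_fetch_add(&logical_clock, 1) + 1;

    /* Only the first writer wins; later racers leave the slot alone. */
    uint64_t expected = 0;
    atomic_compare_exchange_strong(&first_exec_stamp[func_id],
                                   &expected, now);
}
```

Sorting the functions by their non-zero slot values then gives the startup
order, and the slots can be dumped with the rest of the counters.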

Another possibility is to use XRay to implement the functionality -- XRay
is useful for trace-like profiling by design.


On Thu, Jan 17, 2019 at 10:24 AM Manman Ren <manman.ren at gmail.com> wrote:

> An order file is used to teach ld64 how to order the functions in a binary.
> If we put all functions executed during startup together in the right
> order, we will greatly reduce the page faults during startup.
> To generate an order file for iOS apps, we usually use dtrace, but some
> apps have various startup scenarios that we want to capture in the order
> file. The dtrace approach is not easy to automate, and it is hard to
> capture the different ways of starting an app without automation.
> Instrumented builds, however, can be deployed to phones and profile data
> can be automatically collected.
> For the Facebook app, by looking at the startup distribution, we are
> expecting a big win out of the order file instrumentation, from 100ms to
> 500ms+, in startup time.
> The basic idea of the pass is to use a circular buffer to log the
> execution ordering of the functions. We only log the function when it is
> first executed. Instead of logging the symbol name of the function, we log
> a pair of integers, with one integer specifying the module id, and the
> other specifying the function id within the module.
> In this pass, we add three global variables:
> (1) an order file buffer
> The order file buffer is a circular buffer in its own LLVM section. Each
> entry is a pair of integers, with one integer specifying the module id, and
> the other specifying the function id within the module.
> (2) a bitmap for each module: one bit for each function to say if the
> function is already executed;
> (3) a global index to the buffer
> At the function prologue, if the function has not been executed (by
> checking the bitmap), log the module id and the function id, then
> atomically increase the index.
> This pass is intended to be used as a ThinLTO pass or an LTO pass. It maps
> each module to a distinct integer and also generates a mapping file so we
> can decode the function symbol name from the pair of ids.
> clang has '-finstrument-function-entry-bare' which inserts a function call
> and is not as efficient.
> Three patches are attached, for llvm, clang, and compiler-rt respectively.
> (1) Migrate to the new pass manager with a shim for the legacy pass
> manager.
> (2) For the order file buffer, consider always emitting definitions,
> making them LinkOnceODR with a COMDAT group.
> (3) Add test cases for the clang/compiler-rt patches.
> (4) Add utilities to deobfuscate the profile dump.
> (5) The size of the buffer is currently hard-coded (
> Thanks Kamal for contributing to the patches! Thanks to Aditya and Saleem
> for doing an initial review pass over the patches!
> Manman
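
For reference, the prologue logic described in the quoted proposal (bitmap
check, log a (module id, function id) pair, bump the index) could look
roughly like the following in C. This is only a sketch under my own
assumptions, not the code from the attached patches: the names
(order_buffer, executed_bitmap, buffer_index, log_first_entry), the sizes
and the module id value are all hypothetical. It also claims the bit and
the slot with atomic read-modify-write operations before writing the
entry, which is one way to avoid two threads logging the same function or
sharing a slot:

```c
#include <stdatomic.h>
#include <stdint.h>

#define BUF_ENTRIES 8u /* hypothetical buffer size */
#define NUM_FUNCS 64u  /* hypothetical functions per module */

/* One buffer entry: (module id, function id within the module). */
struct order_entry {
    uint32_t module_id;
    uint32_t func_id;
};

static struct order_entry order_buffer[BUF_ENTRIES];          /* (1) */
static _Atomic uint8_t executed_bitmap[(NUM_FUNCS + 7) / 8];  /* (2) */
static _Atomic uint32_t buffer_index;                         /* (3) */

static const uint32_t this_module_id = 7; /* assumed per-module constant */

/* What the instrumented prologue of function `func_id` would do. */
static void log_first_entry(uint32_t func_id) {
    uint8_t mask = (uint8_t)(1u << (func_id % 8));

    /* Set the "executed" bit and learn whether it was already set. */
    uint8_t old = atomic_fetch_or(&executed_bitmap[func_id / 8], mask);
    if (old & mask)
        return; /* already logged once */

    /* Atomically claim the next slot; wrap around (circular buffer). */
    uint32_t slot = atomic_fetch_add(&buffer_index, 1) % BUF_ENTRIES;
    order_buffer[slot].module_id = this_module_id;
    order_buffer[slot].func_id = func_id;
}
```

Calling log_first_entry(5), then log_first_entry(3), then log_first_entry(5)
again leaves two entries in the buffer, (7, 5) followed by (7, 3), with the
index at 2.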