[llvm-dev] [RFC] Adding Binary ID into LLVM Profiles

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Thu Jun 24 14:39:12 PDT 2021


Hi Gulfem, the current profile matching scheme supports function-level
mismatch detection, which is at a finer level of granularity than the
executable-level build ID. What is the use case for this level of
identification?

David

On Mon, Jun 14, 2021 at 5:47 PM Gulfem Savrun Yeniceri via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Motivation
>
> There is no direct way of associating binaries with the corresponding
> profiles in LLVM. Therefore, source code coverage processing requires an
> additional post-processing step to match the executables to their
> associated profiles. To improve this, we propose embedding binary
> IDs into profiles, so that we can uniquely identify a profile and easily
> find the relevant binary.
>
> Background
> Binary ID
>
> We use the name binary ID to refer to the unique identifiers used in
> binaries in different file formats. Build ID
> <https://fedoraproject.org/wiki/Releases/FeatureBuildId> is a unique
> identifier for the build that is included in the ELF file format. It was
> originally introduced in the GNU toolchain, and is used for various
> purposes, such as associating binaries with core dumps. Build ID is
> optional, and can be enabled with the -Wl,--build-id option. To the best
> of our knowledge, similar unique identifiers are used in other file
> formats. For example, a unique identifier called LC_UUID is used in
> Mach-O, and similarly a GUID (Globally Unique Identifier) is used in COFF.
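
To make the ELF case concrete, here is a rough sketch of how a build ID can be pulled out of the raw bytes of an ELF note section or PT_NOTE segment. It is not part of the proposal. It assumes little-endian notes with 4-byte alignment, and the names find_build_id and make_note are made up for illustration.

```python
import struct
from typing import Optional

NT_GNU_BUILD_ID = 3  # ELF note type that carries the GNU build ID


def make_note(name: bytes, note_type: int, desc: bytes) -> bytes:
    """Encode one little-endian ELF note (hypothetical helper for the example)."""
    def pad(b: bytes) -> bytes:
        return b + b"\x00" * (-len(b) % 4)  # pad to a 4-byte boundary
    header = struct.pack("<III", len(name), len(desc), note_type)
    return header + pad(name) + pad(desc)


def find_build_id(notes: bytes) -> Optional[bytes]:
    """Return the descriptor of the first GNU build ID note, or None."""
    offset = 0
    while offset + 12 <= len(notes):
        # Each note starts with three 32-bit words: namesz, descsz, type.
        namesz, descsz, note_type = struct.unpack_from("<III", notes, offset)
        offset += 12
        name = notes[offset:offset + namesz]
        offset += (namesz + 3) & ~3  # skip the name plus its padding
        desc = notes[offset:offset + descsz]
        offset += (descsz + 3) & ~3  # skip the descriptor plus its padding
        if note_type == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc
    return None
```

For example, find_build_id(make_note(b"GNU\x00", NT_GNU_BUILD_ID, some_id)) returns some_id. The descriptor length is read from the note itself, which is why a consumer cannot assume a fixed-size ID.
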
>
> Profiling
>
> Clang supports profiling with instrumentation
> <https://clang.llvm.org/docs/UsersManual.html#profiling-with-instrumentation>
> for two main purposes:
>
>    1. Front-end instrumentation, where the compiler front-end inserts
>       instrumentation for collecting source code coverage.
>    2. IR-level instrumentation, where LLVM inserts instrumentation during
>       optimizations for PGO (Profile-Guided Optimization).
>
>
> Profiling inserts instrumentation code into binaries, which is used by
> compiler-rt (the compiler runtime) during execution. When the instrumented
> binary executes, it writes a raw profile (.profraw). Multiple raw
> profiles are merged together using the llvm-profdata
> <https://llvm.org/docs/CommandGuide/llvm-profdata.html> tool. At the end,
> a single indexed profile (.profdata) is created, which is used to generate
> source code coverage reports.
>
> The profile format consists of two major parts:
>
>    1. The profile header includes the version and magic number (and, in
>       the raw profile, the padding and size of each section).
>    2. The profile data includes the function name and hash, and pointers
>       to three sections: counters, names, and value profiling counters
>       per function.
>
>
> Proposal
>
> We propose adding the build ID, which is the unique binary ID in ELF, into
> profiles to improve the source code coverage post-processing step. Although
> we target the ELF file format, we are proposing a design that can be
> leveraged and extended for other file formats, such as Mach-O and COFF.
>
> Extending profile format
>
> We need to extend both the raw and indexed profile formats to include the
> build ID. Since the build ID does not have a fixed length, we will add a
> variable-length byte array at the end of the profile formats. We will also
> change the compiler-rt profiling runtime for ELF platforms to read build
> IDs from ELF data in memory and write them into the raw profile.
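
One way to picture a variable-length byte array at the end of a profile is the sketch below. The RFC does not spell out an encoding, so everything here is an assumption made for illustration: the ID bytes are followed by a little-endian 64-bit length so that a reader can locate them from the end of the file, and the function names are hypothetical.

```python
import struct
from typing import Tuple

LENGTH_FIELD = struct.Struct("<Q")  # assumed trailer: 64-bit little-endian length


def append_binary_id(profile: bytes, binary_id: bytes) -> bytes:
    """Append the binary ID and its length after the existing profile bytes."""
    return profile + binary_id + LENGTH_FIELD.pack(len(binary_id))


def split_binary_id(blob: bytes) -> Tuple[bytes, bytes]:
    """Split a blob written by append_binary_id into (profile, binary_id)."""
    if len(blob) < LENGTH_FIELD.size:
        raise ValueError("blob is too short to hold the length field")
    id_end = len(blob) - LENGTH_FIELD.size
    (id_size,) = LENGTH_FIELD.unpack_from(blob, id_end)
    if id_size > id_end:
        raise ValueError("recorded binary ID length exceeds the blob size")
    return blob[:id_end - id_size], blob[id_end - id_size:id_end]
```

Because the length is stored explicitly, a 20-byte ID and a 16-byte ID round-trip equally well. A layout that records the size in the header instead would serve the same purpose.
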
>
> Extending profiling tools
>
> Since the profile format changes, we also need to extend the tools that
> process profiles. We need to extend the ProfileData library functions
> that the llvm-profdata tool uses to operate on profiles, and add support
> for printing binary IDs in the profiles.
>
> Future Work
>
> Embedding binary IDs into profiles would also enable implementing support
> for the debuginfod <https://sourceware.org/elfutils/Debuginfod.html>
> library in llvm-cov
> <https://lists.llvm.org/pipermail/llvm-dev/2020-August/144708.html>,
> where the tool will automatically download the binaries corresponding to
> the input profile.
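
As a rough illustration of that lookup: debuginfod servers expose artifacts under /buildid/BUILDID/executable, with the build ID written as lowercase hex. The sketch below only builds that URL and does no network I/O. The function name and the server address in the usage note are placeholders, not anything llvm-cov provides.

```python
def debuginfod_executable_url(server: str, build_id: bytes) -> str:
    """Build the debuginfod request URL for the executable with this build ID."""
    if not build_id:
        raise ValueError("build ID must not be empty")
    # bytes.hex() yields lowercase hex, which is the form debuginfod expects.
    return f"{server.rstrip('/')}/buildid/{build_id.hex()}/executable"
```

Given a binary ID read back from a profile, debuginfod_executable_url("https://debuginfod.example.org", binary_id) yields the address a tool would fetch to obtain the matching binary.
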
>
> References
>
> - https://fedoraproject.org/wiki/Releases/FeatureBuildId
>
> - https://clang.llvm.org/docs/UsersManual.html#profiling-with-instrumentation
>
> - https://llvm.org/docs/CommandGuide/llvm-profdata.html
>
> - https://lists.llvm.org/pipermail/llvm-dev/2020-August/144708.html
>
> - https://sourceware.org/elfutils/Debuginfod.html
>
>
>
> Please let us know if you have any suggestions or questions.
>
>
> Thanks,
>
>
> Gülfem
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>