[llvm-dev] [RFC] Placing profile name data, and coverage data, outside of object files

Mon Jul 3 11:19:32 PDT 2017

2017-06-30 22:04 GMT-07:00 Sean Silva via llvm-dev <llvm-dev at lists.llvm.org>:

>
>
> On Fri, Jun 30, 2017 at 5:54 PM, via llvm-dev <llvm-dev at lists.llvm.org>
> wrote:
>
>> Problem
>> -------
>>
>> Instrumentation for PGO and frontend-based coverage places a large amount of
>> data in object files, even though the majority of this data is not needed at
>> run-time. All the data is needlessly duplicated while generating archives, and
>> again while linking. PGO name data is written out into raw profiles by
>> instrumented programs, slowing down the training and code coverage workflows.
>>
>> Here are some numbers from a coverage + RA build of ToT clang:
>>
>>   * Size of the build directory: 4.3 GB
>>
>>   * Wall time needed to run "clang -help" with an SSD: 0.5 seconds
>>
>>   * Size of the clang binary: 725.24 MB
>>
>>   * Space wasted on duplicate name/coverage data (*.o + *.a): 923.49 MB
>>     - Size contributed by __llvm_covmap sections: 1.02 GB
>>       \_ Just within clang: 340.48 MB
>>
>
> We live with this duplication for debug info. In some sense, if the
> overhead is small compared to debug info, should we even bother (i.e., we
> assume that users accommodate debug builds, so that is a reasonable bound
> on the tolerable build directory size). (I don't know the numbers; this
> seems pretty large so maybe it is significant compared to debug info; just
> saying that looking at absolute numbers is misleading here; numbers
> compared to debug info are a closer measure to the user's perceptions)
>

From a build-directory point of view, I agree. However, when deploying to an
embedded device with limited space/memory, you can strip the debug info and
keep it locally, because it is not needed on the device for running (or for
remote debugging). Is that also the case for the profile info?

-- 
Mehdi


>
> In fact, one overall architectural observation I have is that the most
> complicated part of all this is simply establishing the workflow to plumb
> together data emitted per-TU to a tool that needs that information to do
> some post-processing step on the results of running the binary. That sounds
> a lot like the role of debug info. In fact, having a debugger open a core
> file is precisely equivalent to what llvm-profdata needs to do in this
> regard AFAICT.
>
> So it would be best if possible to piggyback on all the effort that has
> gone into plumbing that data to make debug info work. For example, I know
> that on Darwin there's a fair amount of system-level integration to make
> split dwarf "just work" while keeping debug info out of final binaries.
>
> If there is a not-too-hacky way to piggyback on debug info, that's likely
> to be a really slick solution. For example, debug info could in principle
> (if it doesn't already) contain information about the name of each counter
> in the counter array, so in principle it would be a complete enough
> description to identify each counter.
>
> I'm not very familiar with DWARF, but I'm imagining something like
> reserving an LLVM vendor-specific DWARF opcode/attribute/whatever and then
> sticking a blob of data in there. Presumably we have code somewhere in LLDB
> that is "here's a binary, find debug info for it", and in principle we
> could factor out that code and lift it into an LLVM library
> (libFindDebugInfo) that llvm-profdata could use.
>
>
>>     - Size contributed by __llvm_prf_names sections: 327.46 MB
>>       \_ Just within clang: 106.76 MB
>>
>>     => Space wasted within the clang binary: 447.24 MB
>>
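To put the two "just within clang" figures next to the binary size quoted above, here is the arithmetic as a small Python snippet. The variable names are mine; the three input numbers are taken from the list in the RFC.

```python
# Figures quoted in the RFC, in MB.
covmap_in_clang_mb = 340.48   # __llvm_covmap, just within clang
names_in_clang_mb = 106.76    # __llvm_prf_names, just within clang
binary_mb = 725.24            # size of the clang binary

# The "space wasted within the clang binary" line is the sum of the two.
wasted_mb = covmap_in_clang_mb + names_in_clang_mb
wasted_fraction = wasted_mb / binary_mb

print(f"{wasted_mb:.2f} MB wasted, {wasted_fraction:.1%} of the binary")
# prints "447.24 MB wasted, 61.7% of the binary"
```

In other words, roughly three fifths of the instrumented binary is name and coverage data.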
>> Running an instrumented clang binary triggers a 143MB raw profile write which
>> is slow even with an SSD. This problem is particularly bad for frontend-based
>> coverage because it generates a lot of extra name data: however, the situation
>> can also be improved for PGO instrumentation.
>>
>> Proposal
>> --------
>>
>> Place PGO name data and coverage data outside of object files. This would
>> eliminate data duplication in *.a/*.o files, shrink binaries, shrink raw
>> profiles, and speed up instrumented programs.
>>
>> In more detail:
>>
>> 1. The frontends get a new `-fprofile-metadata-dir=<path>` option. This lets
>> users specify where llvm will store profile metadata. If the metadata starts to
>> take up too much space, there's just one directory to clean.
>>
>> 2. The frontends continue emitting PGO name data and coverage data in the same
>> llvm::Module. So does LLVM's IR-based PGO implementation. No change here.
>>
>> 3. If the InstrProf lowering pass sees that a metadata directory is available,
>> it constructs a new module, copies the name/coverage data into it, hashes the
>> module, and attempts to write that module to:
>>
>>   <metadata-dir>/<module-hash>.bc   (the metadata module)
>>
>> If this write operation fails, it scraps the new module: it keeps all the
>> metadata in the original module, and there are no changes from the current
>> process. I.e., with this proposal we preserve backwards compatibility.
>>
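For concreteness, here is a minimal Python sketch of the write-or-fall-back behaviour described in step 3. It is not the proposed LLVM implementation: the function name `write_metadata_module` is hypothetical, the payload is treated as an opaque byte string rather than a bitcode module, and SHA-256 is an assumption, since the RFC does not say which hash would be used.

```python
import hashlib
import os
import tempfile
from typing import Optional


def write_metadata_module(metadata_dir: Optional[str], payload: bytes) -> Optional[str]:
    """Try to store payload at <metadata_dir>/<hash>.bc.

    Returns the path on success, or None when the caller should keep the
    name/coverage data in the original module (the backwards-compatible path).
    """
    if not metadata_dir:
        return None  # no metadata directory was given

    # Assumption: SHA-256 of the payload stands in for the "module hash".
    module_hash = hashlib.sha256(payload).hexdigest()
    final_path = os.path.join(metadata_dir, module_hash + ".bc")

    tmp_path = None
    try:
        # Write to a temporary file first, then rename, so a reader never
        # observes a partially written metadata module.
        fd, tmp_path = tempfile.mkstemp(dir=metadata_dir, suffix=".tmp")
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        os.replace(tmp_path, final_path)
    except OSError:
        # Any failure (missing directory, no permission, disk full, ...)
        # scraps the new module; clean up and report "not written".
        if tmp_path is not None and os.path.exists(tmp_path):
            try:
                os.unlink(tmp_path)
            except OSError:
                pass
        return None
    return final_path
```

Because the file name is derived from the contents, writing the same payload twice yields the same path, and a changed payload gets a new path rather than overwriting the old one.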
>
> Based on my experience with Clang's implicit modules, I'm *extremely* wary
> of anything that might cause the compiler to emit a file that the build
> system cannot guess the name of. In fact, having the compiler emit a file
> that is not explicitly listed on the command line is basically just as bad
> in practice (in terms of feasibility of informing the build system about
> it).
>
> As a simple example, ninja simply cannot represent a dependency of this
> type, so if you delete a <metadata-dir>/<module-hash>.bc it won't know
> things need to be rebuilt (and it won't know how to clean it, etc.).
>
> So I would really strongly recommend against doing this.
>
> Again, these problems of system integration (in particular build system
> integration) are nasty, and if you can bypass this and piggyback on debug
> info then everything will "just work" because the folks that care about
> making sure that debugging "just works" already did the work for you.
> It might be more work in the short term to do the debug info approach (if
> it is feasible at all), but I can tell you based on the experience with
> implicit modules (and I'm sure you have some experience of your own) that
> there's just going to be a neverending tail of hitches and ways that things
> don't work (or work poorly) due to not having the build system / overall
> system integration right, so it will be worth it in the long run.
>
> -- Sean Silva
>
>
>>
>> 4. Once the metadata module is written, the name/coverage data are entirely
>> stripped out of the original module. They are replaced by a path to the
>> metadata module:
>>
>>   @__llvm_profiling_metadata = "<metadata-dir>/<module-hash>.bc",
>>                                section "__llvm_prf_link"
>>
>> This allows incremental builds to work properly, which is an important use case
>> for code coverage users. When an object is rebuilt, it gets a fresh link to a
>> fresh profiling metadata file. Although stale files can accumulate in the
>> metadata directory, the stale files cannot ever be used.
>>
>> In an IDE like Xcode, since there's just one target binary per scheme, it's
>> possible to clean the metadata directory by removing the modules which aren't
>> referenced by the target binary.
>>
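The cleanup described for the single-target case could look roughly like the Python sketch below. It assumes the list of links has already been extracted from the target binary's `__llvm_prf_link` section (that extraction is not shown, and no existing tool is implied); `prune_metadata_dir` is a hypothetical name.

```python
import os
from typing import Iterable, List


def prune_metadata_dir(metadata_dir: str, referenced: Iterable[str]) -> List[str]:
    """Delete *.bc files in metadata_dir that no link refers to.

    `referenced` holds the metadata-module paths recorded in the target
    binary. Returns the names of the files that were removed.
    """
    keep = {os.path.basename(path) for path in referenced}
    removed = []
    for name in sorted(os.listdir(metadata_dir)):
        # Only touch metadata modules; leave unrelated files alone.
        if name.endswith(".bc") and name not in keep:
            os.remove(os.path.join(metadata_dir, name))
            removed.append(name)
    return removed
```

With several binaries sharing one metadata directory, `referenced` would have to be the union of the links from all of them, which is why this is only straightforward in the one-target-per-scheme setup.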
>> 5. The raw profile format is updated so that links to metadata files are written
>> out in each profile. This makes it possible for all existing llvm-profdata and
>> llvm-cov commands to work, seamlessly.
>>
>> The indexed profile format will *not* be updated: i.e., it will contain a full
>> symbol table, and no links. This simplifies the coverage mapping reader, because
>> a full symbol table is guaranteed to exist before any function records are
>> parsed. It also reduces the amount of coding, and makes it easier to preserve
>> backwards compatibility :).
>>
>> 6. The raw profile reader will learn how to read links, open up the metadata
>> modules it finds links to, and collect name data from those modules.
>>
>> 7. The coverage reader will learn how to read the __llvm_prf_link section, open
>> up metadata modules, and lazily read coverage mapping data.
>>
>> Alternate Solutions
>> -------------------
>>
>> 1. Instead of copying name data into an external metadata module, just copy the
>> coverage mapping data.
>>
>> I've actually prototyped this. This might be a good way to split up patches,
>> although I don't see why we wouldn't want to tackle the name data problem
>> eventually.
>>
>> 2. Instead of emitting links to external metadata modules, modify llvm-cov and
>> llvm-profdata so that they require a path to the metadata directory.
>>
>> The issue with this is that it's way too easy to read stale metadata. It's also
>> less user-friendly, which hurts adoption.
>>
>> 3. Use something other than llvm bitcode for the metadata module format.
>>
>> Since we're mostly writing large binary blobs (compressed name data or
>> pre-encoded source range mapping info), using bitcode shouldn't be too slow, and
>> we're not likely to get better compression with a different format.
>>
>> Bitcode is also convenient, and is nice for backwards compatibility.
>>
>> --------------------------------------------------------------------------------
>>
>> If you've made it this far, thanks for taking a look! I'd appreciate any
>> feedback.
>>
>> vedant
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>
>