[llvm-dev] [RFC] Placing profile name data, and coverage data, outside of object files

Sean Silva via llvm-dev llvm-dev at lists.llvm.org
Fri Jun 30 22:35:28 PDT 2017


On Fri, Jun 30, 2017 at 10:25 PM, Sean Silva <chisophugis at gmail.com> wrote:

>
>
> On Fri, Jun 30, 2017 at 10:04 PM, Sean Silva <chisophugis at gmail.com>
> wrote:
>
>>
>>
>> On Fri, Jun 30, 2017 at 5:54 PM, via llvm-dev <llvm-dev at lists.llvm.org>
>> wrote:
>>
>>> Problem
>>> -------
>>>
>>> Instrumentation for PGO and frontend-based coverage places a large
>>> amount of
>>> data in object files, even though the majority of this data is not
>>> needed at
>>> run-time. All the data is needlessly duplicated while generating
>>> archives, and
>>> again while linking. PGO name data is written out into raw profiles by
>>> instrumented programs, slowing down the training and code coverage
>>> workflows.
>>>
>>> Here are some numbers from a coverage + RA build of ToT clang:
>>>
>>>   * Size of the build directory: 4.3 GB
>>>
>>>   * Wall time needed to run "clang -help" with an SSD: 0.5 seconds
>>>
>>>   * Size of the clang binary: 725.24 MB
>>>
>>>   * Space wasted on duplicate name/coverage data (*.o + *.a): 923.49 MB
>>>     - Size contributed by __llvm_covmap sections: 1.02 GB
>>>       \_ Just within clang: 340.48 MB
>>>
>>
>> We live with this duplication for debug info. In some sense, if the
>> overhead is small compared to debug info, should we even bother? (I.e., we
>> assume that users accommodate debug builds, so that is a reasonable bound
>> on the tolerable build directory size.) I don't know the numbers; this
>> seems pretty large, so maybe it is significant compared to debug info. I'm
>> just saying that absolute numbers are misleading here; numbers relative to
>> debug info are a closer measure of the user's perception.
>>
>> One overall architectural observation: the most complicated part of all
>> this is simply establishing the workflow that plumbs data emitted per-TU
>> through to a tool that needs it for a post-processing step on the results
>> of running the binary. That sounds a lot like the role of debug info. In
>> fact, having a debugger open a core file is precisely equivalent to what
>> llvm-profdata needs to do in this regard, AFAICT.
>>
>
> In fact, it's so equivalent that you could in principle read the actual
> counter values directly out of a core file. A core file could literally be
> used as a raw profile.
>
> E.g. you could in principle open the core in the debugger and then do:
>
> p __profd_foo
> p __profd_bar
> ...
>

Sorry, should be __profc I think (or whatever the counters are called)

-- Sean Silva

>
> (and walking vprof nodes would be more complicated but doable)
>
> I'm not necessarily advocating this literally be done; just showing that
> "everything you need is there".
>
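> To make "everything you need is there" concrete, here's a minimal sketch,
> using the llvm::object API, of the half of such a reader that doesn't touch
> the core file at all: it just lists the counter arrays whose address ranges
> a (hypothetical) core-file-based tool would then pull out of the dump:
>
>   #include "llvm/Object/ObjectFile.h"
>   #include "llvm/Support/Error.h"
>   #include "llvm/Support/Format.h"
>   #include "llvm/Support/raw_ostream.h"
>
>   using namespace llvm;
>   using namespace llvm::object;
>
>   // Print the address of every __profc_* counter array in the binary. A
>   // core-file reader would fetch those address ranges out of the core to
>   // recover the live counter values.
>   int main(int argc, char **argv) {
>     if (argc < 2)
>       return 1;
>     Expected<OwningBinary<ObjectFile>> ObjOrErr =
>         ObjectFile::createObjectFile(argv[1]);
>     if (!ObjOrErr) {
>       logAllUnhandledErrors(ObjOrErr.takeError(), errs(), "error: ");
>       return 1;
>     }
>     for (const SymbolRef &Sym : ObjOrErr->getBinary()->symbols()) {
>       Expected<StringRef> Name = Sym.getName();
>       if (!Name) {
>         consumeError(Name.takeError());
>         continue;
>       }
>       if (!Name->startswith("__profc_"))
>         continue;
>       Expected<uint64_t> Addr = Sym.getAddress();
>       if (!Addr) {
>         consumeError(Addr.takeError());
>         continue;
>       }
>       outs() << *Name << " @ " << format_hex(*Addr, 0) << "\n";
>     }
>     return 0;
>   }
>
> Pulling those address ranges out of the core is exactly what the debugger
> already knows how to do, which is the point.
>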
> Note also that the debug info approach has another nice advantage: it keeps
> the runtime memory overhead of the program image to the absolute minimum,
> which is important for embedded applications. Debug info naturally stays
> out of the program image, so this problem is solved automatically.
>
> -- Sean Silva
>
>
>>
>> So it would be best, if possible, to piggyback on all the effort that has
>> gone into plumbing that data to make debug info work. For example, I know
>> that on Darwin there's a fair amount of system-level integration to make
>> split DWARF "just work" while keeping debug info out of final binaries.
>>
>> If there is a not-too-hacky way to piggyback on debug info, that's likely
>> to be a really slick solution. For example, debug info could in principle
>> (if it doesn't already) record the name of each counter in the counter
>> array, which would be a complete enough description to identify each
>> counter.
>>
>> I'm not very familiar with DWARF, but I'm imagining something like
>> reserving an LLVM vendor-specific DWARF opcode/attribute/whatever and then
>> sticking a blob of data in there. Presumably we have code somewhere in LLDB
>> that is "here's a binary, find debug info for it", and in principle we
>> could factor out that code and lift it into an LLVM library
>> (libFindDebugInfo) that llvm-profdata could use.
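>>
>> For example, if we reserved an attribute in the DWARF user range and hung
>> the blob off the compile unit DIE, the reader side might look roughly like
>> the sketch below. The attribute number and the blob's contents are made up
>> for illustration; the point is just that the DWARF libraries already give
>> us the plumbing:
>>
>>   #include "llvm/BinaryFormat/Dwarf.h"
>>   #include "llvm/DebugInfo/DWARF/DWARFContext.h"
>>   #include "llvm/DebugInfo/DWARF/DWARFDie.h"
>>   #include "llvm/Object/ObjectFile.h"
>>   #include "llvm/Support/raw_ostream.h"
>>
>>   using namespace llvm;
>>
>>   // Hypothetical vendor attribute in the DW_AT_lo_user..DW_AT_hi_user
>>   // range (0x2000-0x3fff); the exact value is made up here.
>>   static const dwarf::Attribute LLVMProfilingBlobAttr =
>>       static_cast<dwarf::Attribute>(0x3e00);
>>
>>   void dumpProfilingBlobs(const object::ObjectFile &Obj) {
>>     std::unique_ptr<DWARFContext> Ctx = DWARFContext::create(Obj);
>>     for (const auto &CU : Ctx->compile_units()) {
>>       DWARFDie UnitDie = CU->getUnitDIE(/*ExtractUnitDIEOnly=*/true);
>>       if (auto Val = UnitDie.find(LLVMProfilingBlobAttr))
>>         if (auto Blob = Val->getAsBlock())
>>           outs() << "profiling blob: " << Blob->size() << " bytes\n";
>>     }
>>   }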
>>
>>
>>>     - Size contributed by __llvm_prf_names sections: 327.46 MB
>>>       \_ Just within clang: 106.76 MB
>>>
>>>     => Space wasted within the clang binary: 447.24 MB
>>>
>>> Running an instrumented clang binary triggers a 143MB raw profile write,
>>> which is slow even with an SSD. This problem is particularly bad for
>>> frontend-based coverage because it generates a lot of extra name data;
>>> however, the situation can also be improved for PGO instrumentation.
>>>
>>> Proposal
>>> --------
>>>
>>> Place PGO name data and coverage data outside of object files. This would
>>> eliminate data duplication in *.a/*.o files, shrink binaries, shrink raw
>>> profiles, and speed up instrumented programs.
>>>
>>> In more detail:
>>>
>>> 1. The frontends get a new `-fprofile-metadata-dir=<path>` option. This
>>> lets
>>> users specify where llvm will store profile metadata. If the metadata
>>> starts to
>>> take up too much space, there's just one directory to clean.
>>>
>>> 2. The frontends continue emitting PGO name data and coverage data in
>>> the same
>>> llvm::Module. So does LLVM's IR-based PGO implementation. No change here.
>>>
>>> 3. If the InstrProf lowering pass sees that a metadata directory is
>>> available,
>>> it constructs a new module, copies the name/coverage data into it,
>>> hashes the
>>> module, and attempts to write that module to:
>>>
>>>   <metadata-dir>/<module-hash>.bc   (the metadata module)
>>>
>>> If this write operation fails, it scraps the new module: it keeps all the
>>> metadata in the original module, and nothing changes from the current
>>> process. I.e., with this proposal we preserve backwards compatibility.
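>>>
>>> Roughly, the lowering-pass side of this step could look like the sketch
>>> below (names and error handling are simplified; the actual patch will
>>> differ in the details):
>>>
>>>   #include "llvm/ADT/SmallString.h"
>>>   #include "llvm/Bitcode/BitcodeWriter.h"
>>>   #include "llvm/IR/Module.h"
>>>   #include "llvm/Support/FileSystem.h"
>>>   #include "llvm/Support/MD5.h"
>>>   #include "llvm/Support/Path.h"
>>>   #include "llvm/Support/raw_ostream.h"
>>>
>>>   using namespace llvm;
>>>
>>>   static bool tryWriteMetadataModule(Module &MetadataM,
>>>                                      StringRef MetadataDir) {
>>>     // Serialize the metadata module to a buffer so it can be hashed; the
>>>     // hash becomes the <module-hash> part of the file name.
>>>     SmallString<0> Buf;
>>>     raw_svector_ostream BufOS(Buf);
>>>     WriteBitcodeToFile(MetadataM, BufOS);
>>>
>>>     MD5 Hash;
>>>     Hash.update(Buf);
>>>     MD5::MD5Result Result;
>>>     Hash.final(Result);
>>>     SmallString<32> HashStr = Result.digest();
>>>
>>>     // <metadata-dir>/<module-hash>.bc
>>>     SmallString<128> Path(MetadataDir);
>>>     sys::path::append(Path, HashStr + ".bc");
>>>
>>>     std::error_code EC;
>>>     raw_fd_ostream OS(Path, EC, sys::fs::OF_None);
>>>     if (EC)
>>>       return false; // Caller keeps the metadata in the original module.
>>>     OS << Buf;
>>>     return !OS.has_error();
>>>   }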
>>>
>>
>> Based on my experience with Clang's implicit modules, I'm *extremely*
>> wary of anything that might cause the compiler to emit a file whose name
>> the build system cannot guess. In fact, having the compiler emit a file
>> that is not explicitly listed on the command line is basically just as bad
>> in practice (in terms of the feasibility of informing the build system
>> about it).
>>
>> As a simple example, ninja simply cannot represent a dependency of this
>> type, so if you delete a <metadata-dir>/<module-hash>.bc it won't know
>> things need to be rebuilt (and it won't know how to clean it, etc.).
>>
>> So I would really strongly recommend against doing this.
>>
>> Again, these problems of system integration (in particular, build system
>> integration) are nasty. If you can bypass them and piggyback on debug
>> info, everything will "just work", because the folks who care about making
>> debugging "just work" have already done that work for you. It might be
>> more work in the short term to take the debug info approach (if it is
>> feasible at all), but I can tell you from the experience with implicit
>> modules (and I'm sure you have some experience of your own) that otherwise
>> there will be a never-ending tail of hitches and things that work poorly,
>> or not at all, because the build system / overall system integration isn't
>> right. Getting that right will be worth it in the long run.
>>
>> -- Sean Silva
>>
>>
>>>
>>> 4. Once the metadata module is written, the name/coverage data are
>>> entirely
>>> stripped out of the original module. They are replaced by a path to the
>>> metadata module:
>>>
>>>   @__llvm_profiling_metadata = "<metadata-dir>/<module-hash>.bc",
>>>                                section "__llvm_prf_link"
>>>
>>> This allows incremental builds to work properly, which is an important
>>> use case
>>> for code coverage users. When an object is rebuilt, it gets a fresh link
>>> to a
>>> fresh profiling metadata file. Although stale files can accumulate in the
>>> metadata directory, the stale files cannot ever be used.
>>>
>>> In an IDE like Xcode, since there's just one target binary per scheme,
>>> it's
>>> possible to clean the metadata directory by removing the modules which
>>> aren't
>>> referenced by the target binary.
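>>>
>>> In IR terms, emitting the link from step 4 amounts to something like this
>>> sketch (the linkage and exact types are details to settle in review):
>>>
>>>   #include "llvm/IR/Constants.h"
>>>   #include "llvm/IR/GlobalVariable.h"
>>>   #include "llvm/IR/Module.h"
>>>
>>>   using namespace llvm;
>>>
>>>   // Emit @__llvm_profiling_metadata pointing at the metadata module,
>>>   // placed in the __llvm_prf_link section so the readers can find it.
>>>   static void emitProfilingLink(Module &M, StringRef MetadataModulePath) {
>>>     Constant *PathStr = ConstantDataArray::getString(
>>>         M.getContext(), MetadataModulePath, /*AddNull=*/true);
>>>     auto *GV = new GlobalVariable(M, PathStr->getType(),
>>>                                   /*isConstant=*/true,
>>>                                   GlobalValue::PrivateLinkage, PathStr,
>>>                                   "__llvm_profiling_metadata");
>>>     GV->setSection("__llvm_prf_link");
>>>   }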
>>>
>>> 5. The raw profile format is updated so that links to metadata files are
>>> written
>>> out in each profile. This makes it possible for all existing
>>> llvm-profdata and
>>> llvm-cov commands to work seamlessly.
>>>
>>> The indexed profile format will *not* be updated: i.e., it will contain a
>>> full
>>> symbol table, and no links. This simplifies the coverage mapping reader,
>>> because
>>> a full symbol table is guaranteed to exist before any function records
>>> are
>>> parsed. It also reduces the amount of coding, and makes it easier to
>>> preserve
>>> backwards compatibility :).
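>>>
>>> For concreteness, a link could be encoded as a simple length-prefixed
>>> record appended to the raw profile, along these lines (the layout shown
>>> is purely illustrative and not part of this proposal):
>>>
>>>   #include <cstdint>
>>>
>>>   // Illustrative only: one record per metadata module referenced by the
>>>   // instrumented binary.
>>>   struct ProfileMetadataLinkRecord {
>>>     uint64_t PathSize; // size in bytes of the path that follows
>>>     // ... followed by PathSize bytes of path:
>>>     //     "<metadata-dir>/<module-hash>.bc"
>>>   };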
>>>
>>> 6. The raw profile reader will learn how to read links, open up the
>>> metadata
>>> modules it finds links to, and collect name data from those modules.
>>>
>>> 7. The coverage reader will learn how to read the __llvm_prf_link
>>> section, open
>>> up metadata modules, and lazily read coverage mapping data.
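>>>
>>> Both readers can share the module-loading part. A minimal sketch, using
>>> the existing bitcode reader APIs, with error handling trimmed:
>>>
>>>   #include "llvm/Bitcode/BitcodeReader.h"
>>>   #include "llvm/IR/LLVMContext.h"
>>>   #include "llvm/IR/Module.h"
>>>   #include "llvm/Support/Error.h"
>>>   #include "llvm/Support/MemoryBuffer.h"
>>>
>>>   using namespace llvm;
>>>
>>>   // Open the metadata module named by a link. The module is small (it
>>>   // only holds name data and coverage mapping), so a full parse is fine.
>>>   static Expected<std::unique_ptr<Module>>
>>>   loadMetadataModule(StringRef Path, LLVMContext &Ctx) {
>>>     auto BufOrErr = MemoryBuffer::getFile(Path);
>>>     if (!BufOrErr)
>>>       return errorCodeToError(BufOrErr.getError());
>>>     return parseBitcodeFile((*BufOrErr)->getMemBufferRef(), Ctx);
>>>   }
>>>
>>> From there, the raw profile reader pulls the name data back out of the
>>> module (the lowering currently calls that global __llvm_prf_nm), and
>>> llvm-cov does the same for the coverage mapping.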
>>>
>>> Alternate Solutions
>>> -------------------
>>>
>>> 1. Instead of copying name data into an external metadata module, just
>>> copy the
>>> coverage mapping data.
>>>
>>> I've actually prototyped this. This might be a good way to split up
>>> patches,
>>> although I don't see why we wouldn't want to tackle the name data problem
>>> eventually.
>>>
>>> 2. Instead of emitting links to external metadata modules, modify
>>> llvm-cov and
>>> llvm-profdata so that they require a path to the metadata directory.
>>>
>>> The issue with this is that it's way too easy to read stale metadata.
>>> It's also
>>> less user-friendly, which hurts adoption.
>>>
>>> 3. Use something other than llvm bitcode for the metadata module format.
>>>
>>> Since we're mostly writing large binary blobs (compressed name data or
>>> pre-encoded source range mapping info), using bitcode shouldn't be too
>>> slow, and
>>> we're not likely to get better compression with a different format.
>>>
>>> Bitcode is also convenient, and is nice for backwards compatibility.
>>>
>>> --------------------------------------------------------------------------------
>>>
>>> If you've made it this far, thanks for taking a look! I'd appreciate any
>>> feedback.
>>>
>>> vedant
>>>
>>
>>
>