[llvm-dev] [RFC] Introduce Dump Accumulator

Wed Aug 5 15:51:55 PDT 2020

I like the ability, but I'm not sure about the proposed implementation.

Did you consider a flag that redirects `llvm::outs()` and `llvm::errs()`
into sections of the object file instead? So, you'd say:

`clang ... -mllvm -debug-only=inline ... -mllvm -dump-section=.dump`

and you'd get the regular debug output nicely ordered in the `.dump`
section.

I mainly want to avoid even more output code in the passes but also be able
to collect at least that information. That doesn't mean we couldn't add
another output stream that would always/only redirect into the sections.
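
To make the redirection idea concrete, here is a rough sketch of the
mechanism using `std::ostream` as a stand-in. It is not LLVM code:
`llvm::raw_ostream` has a different interface, and `ScopedStreamCapture` is
a hypothetical name. While the guard is alive, everything written to the
stream lands in an in-memory buffer that could later be emitted as section
contents.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical RAII guard (not an LLVM class): while it is alive, output
// written to OS is captured in an in-memory buffer instead of going to the
// stream's usual destination.
class ScopedStreamCapture {
  std::ostream &OS;
  std::ostringstream Buffer;
  std::streambuf *Saved; // original buffer, restored in the destructor

public:
  explicit ScopedStreamCapture(std::ostream &Stream)
      : OS(Stream), Saved(Stream.rdbuf(Buffer.rdbuf())) {}
  ~ScopedStreamCapture() { OS.rdbuf(Saved); }

  ScopedStreamCapture(const ScopedStreamCapture &) = delete;
  ScopedStreamCapture &operator=(const ScopedStreamCapture &) = delete;

  // Everything captured so far.
  std::string str() const { return Buffer.str(); }
};
```

The passes keep writing to the stream they already use; only the code that
sets up the guard knows the output is being collected.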

~ Johannes

On 8/5/20 5:36 PM, Kazu Hirata via llvm-dev wrote:
> Introduction
> ============
>
> This RFC proposes a mechanism to dump arbitrary messages into object
> files during compilation and retrieve them from the final executable.
>
> Background
> ==========
>
> We often need to collect information from all object files of
> applications.  For example:
>
> - Mircea Trofin needs to collect information from the function
>    inlining pass so that he can train the machine learning model with
>    the information.
>
> - I sometimes need to dump messages from optimization passes to see
>    where and how they trigger.
>
> Now, this process becomes challenging when we build large applications
> with a build system that caches and distributes compilation jobs.  If
> we were to dump messages to stderr, we would have to be careful not to
> interleave messages from multiple object files.  If we were to modify
> a source file, we would have to flush the cache and rebuild the entire
> application to collect dump messages from all relevant object files.
>
> High Level Design
> =================
>
> - LLVM: We provide machinery for individual passes to dump arbitrary
>    messages into a special ELF section in a compressed manner.
>
> - Linker: We simply concatenate the contents of the special ELF
>    section.  No change is needed.
>
> - llvm-readobj: We add an option to retrieve the contents of the
>    special ELF section.
>
> Detailed Design
> ===============
>
> DumpAccumulator analysis pass
> -----------------------------
>
> We create a new analysis pass called DumpAccumulator.  We add the
> analysis pass right at the beginning of the pass pipeline.  The new
> analysis pass holds the dump messages throughout the pass pipeline.
>
> If you would like to dump messages from some pass, you would obtain
> the result of DumpAccumulator in the pass:
>
>    DumpAccumulator::Result *DAR = MAMProxy.getCachedResult<DumpAccumulator>(M);
>
> Then dump messages:
>
>    if (DAR) {
>      DAR->Message += "Processing ";
>      DAR->Message += F.getName();
>      DAR->Message += "\n";
>    }
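
The null check and the append pattern above can be tried outside of LLVM
with a small stand-in. This is only a sketch: `DumpResult` models
`DumpAccumulator::Result` as a single growing string, and `dumpLine` is a
hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Stand-in for DumpAccumulator::Result as used in the RFC snippet: the
// accumulated dump is one growing string.
struct DumpResult {
  std::string Message;
};

// Hypothetical helper: append one line to the accumulator if it exists.
// A null pointer models getCachedResult() finding no cached result, in
// which case nothing is recorded.
void dumpLine(DumpResult *DAR, std::string_view Prefix,
              std::string_view Name) {
  if (!DAR)
    return;
  DAR->Message += Prefix;
  DAR->Message += Name;
  DAR->Message += '\n';
}
```

Calling `dumpLine(&R, "Processing ", "foo")` twice with different names
leaves two newline-terminated lines in `R.Message`; passing `nullptr` is a
no-op.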
>
> AsmPrinter
> ----------
>
> We dump the messages from DumpAccumulator into a section called
> ".llvm_dump" in a compressed manner.  Specifically, the section
> contains:
>
> - LEB128 encoding of the original size in bytes
> - LEB128 encoding of the compressed size in bytes
> - the message compressed by zlib::compress
>
> in that order.
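
The three-field layout listed above can be sketched as follows. The
assumptions here are that both sizes use unsigned LEB128 and that the
payload has already been compressed elsewhere, so this sketch treats it as
opaque bytes. `appendULEB128` and `makeDumpChunk` are hypothetical names,
not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append V to Out in unsigned LEB128: 7 payload bits per byte, with the
// high bit set on every byte except the last.
void appendULEB128(std::string &Out, uint64_t V) {
  do {
    uint8_t Byte = V & 0x7f;
    V >>= 7;
    if (V != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(static_cast<char>(Byte));
  } while (V != 0);
}

// Hypothetical helper: build one chunk as
//   ULEB128(original size) ULEB128(compressed size) compressed bytes.
// Compressed is assumed to be zlib output produced by the caller.
std::string makeDumpChunk(uint64_t OriginalSize,
                          const std::string &Compressed) {
  std::string Chunk;
  appendULEB128(Chunk, OriginalSize);
  appendULEB128(Chunk, Compressed.size());
  Chunk += Compressed;
  return Chunk;
}
```

Because each chunk carries its own compressed size, chunks from different
object files stay separable after they are concatenated.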
>
> llvm-readobj
> ------------
>
> We read the .llvm_dump section.  We dump each chunk of compressed data
> one after another.
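
A reader for the concatenated section could walk the chunks roughly like
this. It is a sketch under the same assumptions as the layout in the RFC:
unsigned LEB128 sizes and no padding between chunks. Decompression is left
out, and `DumpChunk`, `readULEB128`, and `splitDumpSection` are
hypothetical names rather than llvm-readobj code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

struct DumpChunk {
  uint64_t OriginalSize;  // size of the message before compression
  std::string Compressed; // still-compressed payload
};

// Decode one unsigned LEB128 value starting at Pos and advance Pos past it.
// Returns nullopt on truncated or overlong input; Pos is then unspecified.
std::optional<uint64_t> readULEB128(const std::string &Data, size_t &Pos) {
  uint64_t Value = 0;
  for (unsigned Shift = 0; Shift < 64 && Pos < Data.size(); Shift += 7) {
    uint8_t Byte = static_cast<uint8_t>(Data[Pos++]);
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      return Value;
  }
  return std::nullopt;
}

// Split the section contents into chunks, one per contributing object file.
// Returns nullopt if the data does not parse as a whole number of chunks.
std::optional<std::vector<DumpChunk>>
splitDumpSection(const std::string &Section) {
  std::vector<DumpChunk> Chunks;
  size_t Pos = 0;
  while (Pos < Section.size()) {
    std::optional<uint64_t> Original = readULEB128(Section, Pos);
    if (!Original)
      return std::nullopt;
    std::optional<uint64_t> Size = readULEB128(Section, Pos);
    if (!Size || *Size > Section.size() - Pos)
      return std::nullopt; // payload would run past the end of the section
    Chunks.push_back({*Original, Section.substr(Pos, *Size)});
    Pos += *Size;
  }
  return Chunks;
}
```

Each returned chunk would then be decompressed and checked against
`OriginalSize` before being printed.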
>
> Existing Implementation
> =======================
> https://reviews.llvm.org/D84473
>
> Future Directions
> =================
>
> The proposal above does not support the ThinLTO build flow.  To
> support that, I am thinking about putting the message as metadata in
> the IR at the prelink stage.
>
> Thoughts?
>