[llvm-dev] [RFC] Introduce Dump Accumulator

Wed Aug 5 16:22:56 PDT 2020

I think we should consider how this proposed mechanism relates to the
existing mechanism that we have for emitting and capturing optimization
remarks. In some sense, I feel like we already have much of this
capability (e.g., llc has -remarks-section).

  -Hal

On 8/5/20 5:51 PM, Johannes Doerfert via llvm-dev wrote:
> I like the ability, not sure about the proposed implementation though.
>
> Did you consider a flag that redirects `llvm::outs()` and `llvm::errs()`
> into sections of the object file instead? So, you'd say:
>
> `clang ... -mllvm -debug-only=inline ... -mllvm -dump-section=.dump`
>
> and you'd get the regular debug output nicely ordered in the `.dump`
> section.
>
> I mainly want to avoid even more output code in the passes but also be
> able to collect at least that information. That doesn't mean we couldn't
> add another output stream that would always/only redirect into the
> sections.
>
>
> ~ Johannes
>
>
> On 8/5/20 5:36 PM, Kazu Hirata via llvm-dev wrote:
>> Introduction
>> ============
>>
>> This RFC proposes a mechanism to dump arbitrary messages into object
>> files during compilation and retrieve them from the final executable.
>>
>> Background
>> ==========
>>
>> We often need to collect information from all object files of
>> applications.  For example:
>>
>> - Mircea Trofin needs to collect information from the function
>>    inlining pass so that he can train the machine learning model with
>>    the information.
>>
>> - I sometimes need to dump messages from optimization passes to see
>>    where and how they trigger.
>>
>> Now, this process becomes challenging when we build large applications
>> with a build system that caches and distributes compilation jobs.  If
>> we were to dump messages to stderr, we would have to be careful not to
>> interleave messages from multiple object files.  If we were to modify
>> a source file, we would have to flush the cache and rebuild the entire
>> application to collect dump messages from all relevant object files.
>>
>> High Level Design
>> =================
>>
>> - LLVM: We provide machinery for individual passes to dump arbitrary
>>    messages into a special ELF section in compressed form.
>>
>> - Linker: We simply concatenate the contents of the special ELF
>>    section.  No change is needed.
>>
>> - llvm-readobj: We add an option to retrieve the contents of the
>>    special ELF section.
>>
>> Detailed Design
>> ===============
>>
>> DumpAccumulator analysis pass
>> -----------------------------
>>
>> We create a new analysis pass called DumpAccumulator.  We add the
>> analysis pass right at the beginning of the pass pipeline.  The new
>> analysis pass holds the dump messages throughout the pass pipeline.
>>
>> If you would like to dump messages from some pass, you would obtain
>> the result of DumpAccumulator in the pass:
>>
>>    DumpAccumulator::Result *DAR =
>>        MAMProxy.getCachedResult<DumpAccumulator>(M);
>>
>> Then dump messages:
>>
>>    if (DAR) {
>>      DAR->Message += "Processing ";
>>      DAR->Message += F.getName();
>>      DAR->Message += "\n";
>>    }
>>
>> AsmPrinter
>> ----------
>>
>> We dump the messages from DumpAccumulator into a section called
>> ".llvm_dump" in compressed form.  Specifically, the section
>> contains:
>>
>> - LEB128 encoding of the original size in bytes
>> - LEB128 encoding of the compressed size in bytes
>> - the message compressed by zlib::compress
>>
>> in that order.
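
The layout listed above can be sketched in standalone C++. This is only an
illustration under stated assumptions, not the code from the patch: it
assumes the sizes are encoded as unsigned LEB128 (the RFC only says
"LEB128"), it treats the zlib output as an opaque byte vector rather than
calling into zlib, and the names appendULEB128 and makeChunk are
hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append V to Out as unsigned LEB128: 7 payload bits per byte, least
// significant group first, high bit set on every byte except the last.
void appendULEB128(uint64_t V, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = V & 0x7f;
    V >>= 7;
    if (V != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (V != 0);
}

// Build one chunk in the order given in the RFC:
//   [original size][compressed size][compressed bytes]
// Compressed is assumed to already hold the zlib-compressed message.
std::vector<uint8_t> makeChunk(uint64_t OriginalSize,
                               const std::vector<uint8_t> &Compressed) {
  std::vector<uint8_t> Chunk;
  appendULEB128(OriginalSize, Chunk);
  appendULEB128(Compressed.size(), Chunk);
  Chunk.insert(Chunk.end(), Compressed.begin(), Compressed.end());
  return Chunk;
}
```

Storing the compressed size in front of the payload is what would let a
reader find the start of the next chunk once the linker has concatenated
the sections from several object files.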
>>
>> llvm-readobj
>> ------------
>>
>> We read the .llvm_dump section, then decompress and print each chunk
>> of compressed data one after another.
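
Walking the concatenated section might look roughly like the sketch below.
It is not the llvm-readobj code from the patch. It assumes unsigned LEB128
sizes and no padding between chunks, it stops at splitting the section into
chunks (decompressing each payload with zlib is left out), and the names
Chunk, readULEB128 and splitChunks are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Chunk {
  uint64_t OriginalSize;           // size of the message before compression
  std::vector<uint8_t> Compressed; // still-compressed payload
};

// Decode one unsigned LEB128 value starting at Pos and advance Pos past it.
// Returns false if the data ends early or the encoding is too long.
bool readULEB128(const std::vector<uint8_t> &Data, size_t &Pos,
                 uint64_t &Value) {
  Value = 0;
  for (unsigned Shift = 0; Shift < 64; Shift += 7) {
    if (Pos >= Data.size())
      return false;
    uint8_t Byte = Data[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      return true;
  }
  return false;
}

// Split a section made of back-to-back chunks into individual chunks.
// Returns false on malformed input; Chunks then holds what was parsed so far.
bool splitChunks(const std::vector<uint8_t> &Section,
                 std::vector<Chunk> &Chunks) {
  size_t Pos = 0;
  while (Pos < Section.size()) {
    uint64_t Original = 0, CompressedSize = 0;
    if (!readULEB128(Section, Pos, Original) ||
        !readULEB128(Section, Pos, CompressedSize))
      return false;
    if (CompressedSize > Section.size() - Pos)
      return false; // payload would run past the end of the section
    Chunk C;
    C.OriginalSize = Original;
    C.Compressed.assign(Section.begin() + Pos,
                        Section.begin() + Pos + CompressedSize);
    Chunks.push_back(C);
    Pos += CompressedSize;
  }
  return true;
}
```

A reader would then pass each Compressed buffer, together with OriginalSize
as the expected output size, to a zlib decompression routine and print the
results in order.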
>>
>> Existing Implementation
>> =======================
>> https://reviews.llvm.org/D84473
>>
>> Future Directions
>> =================
>>
>> The proposal above does not support the ThinLTO build flow.  To
>> support that, I am thinking about putting the message as metadata in
>> the IR at the prelink stage.
>>
>> Thoughts?
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory