[LLVMdev] [RFC] Less memory and greater maintainability for debug info IR

Sat Oct 18 14:04:35 PDT 2014

> 
> On 2014 Oct 18, at 10:27, Sean Silva <chisophugis at gmail.com> wrote:
> 
> Derp. My bad. It would be nice in the future if you communicated this better in the OP. In the OP it sounds like you are doing this solely for memory, since there is no mention of CPU time or the excessive callback-based RAUW traffic.

It's clear that you found the OP misleading.  I focused this RFC on what
I thought the debug info maintainers would find most compelling.

FTR, the RAUW speedup was in the OP, but I admit I assumed too much
prior familiarity with the problem space for its import to come across:

>> By leveraging the use-list
>> infrastructure for metadata operands -- i.e., only using value handles
>> for non-metadata operands -- we'll [...] increase
>> RAUW speed.

[snip]

>> 7. (Optional) Refactor `DebugMDNode` sub-classes to minimize RAUW
>>    traffic during bitcode serialization.  Now that metadata types are
>>    known, we can write debug info out in an order that makes it cheap
>>    to read back in.
>> 
>>    Note that using `MDUser` will make RAUW much cheaper, since we're
>>    using the use-list infrastructure for most of them.  If RAUW isn't
>>    showing up in a profile, I may skip this.
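
For anyone without that background, here is a toy model of the two RAUW
styles.  It is only a sketch: `Node`, `setOperand`, `rauwViaUseList`
and `rauwViaHandles` are hypothetical names made up for illustration,
not LLVM's classes or API.  The use-list path is a plain pointer write
per use; the handle path makes an indirect call per handle, and each
callback can do arbitrary extra work.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy model only: hypothetical types, not LLVM's actual classes.
struct Node {
  std::vector<Node **> uses;                        // operand slots pointing at this node
  std::vector<std::function<void(Node *)>> handles; // handle-style observers of this node
};

// Point an operand slot at `target` and record the slot on its use-list.
void setOperand(Node *&slot, Node &target) {
  slot = &target;
  target.uses.push_back(&slot);
}

// Use-list RAUW: one pointer write per use, no callbacks.
// Assumes `from` and `to` are distinct nodes.
// Returns the number of operand slots rewritten.
std::size_t rauwViaUseList(Node &from, Node &to) {
  std::size_t n = from.uses.size();
  for (Node **slot : from.uses) {
    *slot = &to;
    to.uses.push_back(slot);
  }
  from.uses.clear();
  return n;
}

// Handle-style RAUW: one indirect call per handle, and each callback is
// free to do arbitrary extra work.  Assumes `from` and `to` are distinct.
// Returns the number of callbacks invoked.
std::size_t rauwViaHandles(Node &from, Node &to) {
  std::size_t n = from.handles.size();
  for (auto &callback : from.handles) {
    callback(&to);
    to.handles.push_back(std::move(callback));
  }
  from.handles.clear();
  return n;
}
```

Both paths leave every reference pointing at the new node; the
difference is the cost per reference, which is what adds up when most
operands are metadata.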

> On 2014 Oct 18, at 10:27, Sean Silva <chisophugis at gmail.com> wrote:
> 
>> Since debug info IR is at the heart of the RAUW bottleneck, I looked
>> into its memory layout (it's a hog).  I started working on PR17891
>> because, besides improving the memory usage, the plan promised to
>> greatly reduce the number of nodes (indirectly reducing RAUW traffic).
>> 
>> In the context of `llvm-lto`, "stage 1" knocked memory usage down from
>> ~5GB to ~3GB -- but didn't reduce the number of nodes.
> 
> Please put these numbers in context. In the OP you were talking about 15.3GB peak for llvm-lto. Why is ~5GB now the peak? Also in the OP, the theoretical improvement, out of 15.3GB, was 2GB after stage 4. How are you getting 2GB improvement out of ~5GB with only stage 1?

I'm quoting numbers from two different things, PR17891 and this
proposal, which I can see is confusing.  "Stage 1" of PR17891 -- have a
look at the PR for context -- yielded a 2.2GB reduction in peak memory
usage in `llvm-lto`.  After that change, we're at 15.3GB peak in
`llvm-lto`.

The ~3GB and ~5GB figures are for debug info metadata alone, not the
`llvm-lto` peak.  A conservative estimate of the allocated memory for
debug info metadata, based on counting live nodes and operands
(post-change), is ~3GB.  Given that "stage 1" of PR17891 dropped peak
memory usage by 2.2GB, I assume that the original cost was ~5GB
(~3GB + 2.2GB).  This proposal drops the conservative estimate by a
further ~2GB to ~1GB.
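
Spelled out as arithmetic, using only the figures above (the constant
names are mine, and the "~" values are rough, so treat the results as
approximate):

```cpp
#include <cassert>
#include <cmath>

// All values in GB, taken from the figures quoted in this thread.
constexpr double peakAfterStage1 = 15.3; // llvm-lto peak after PR17891 "stage 1"
constexpr double stage1Saving    = 2.2;  // peak reduction from "stage 1"
constexpr double debugInfoNow    = 3.0;  // conservative debug info estimate, post-change
constexpr double proposalSaving  = 2.0;  // further reduction from this proposal

// llvm-lto peak before "stage 1" landed: about 17.5GB.
constexpr double peakBeforeStage1 = peakAfterStage1 + stage1Saving;

// Debug info metadata before "stage 1": about 5.2GB, i.e. the "~5GB".
constexpr double debugInfoBefore = debugInfoNow + stage1Saving;

// Debug info metadata after this proposal: about 1GB.
constexpr double debugInfoProposed = debugInfoNow - proposalSaving;
```

So the ~5GB -> ~3GB -> ~1GB sequence tracks debug info metadata only,
while 15.3GB is the whole-process peak.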

>> As Bob suggested, please feel free to join the party!  Less work for me
>> to do later.
> 
> I'm planning on it.

Great!