[LLVMdev] [RFC] Less memory and greater maintainability for debug info IR

Fri Oct 17 18:04:00 PDT 2014

> On Oct 17, 2014, at 3:54 PM, Sean Silva <chisophugis at gmail.com> wrote:
> 
> this seems like the classic situation where someone comes to you asking for X, but what they really want is a solution to an underlying problem Y, for which the best solution, once you actually analyze Y, is Z.

On the contrary, I came into this expecting to work with Eric on
parallelizing the backend, but consistently found that callback-based
RAUW traffic for metadata took almost as much CPU as the backend itself.

Since debug info IR is at the heart of the RAUW bottleneck, I looked
into its memory layout (it's a hog).  I started working on PR17891
because, besides improving the memory usage, the plan promised to
greatly reduce the number of nodes (indirectly reducing RAUW traffic).

In the context of `llvm-lto`, "stage 1" knocked memory usage down from
~5GB to ~3GB -- but didn't reduce the number of nodes.  Before starting
stages "2" and "3", I counted nodes and operands to find which to tackle
first.  Unfortunately, our need to reference local variables and line
table entries directly from the IR-proper limits our ability to refactor
the schema, and those are the nodes we have the most of.

This work will drop debug info memory usage in `llvm-lto` further, from
~3GB down to ~1GB.  It's also a big step toward improving debug info
maintainability.

More importantly (for me), it enables us to refactor uniquing models and
reorder serialization and linking to design away debug info RAUW
traffic -- assuming switching to use-lists doesn't drop it off the
profile.

Regarding "the bigger problem" of LTO memory usage, I expect to see more
than a 2GB drop from this work due to the nature of metadata uniquing
and expiration.  I'm not motivated to quantify it, since even a 2GB drop
-- when combined with a first-class IR and the RAUW-related speedup --
is motivation enough.

There's a lot of work left to do in LTO -- once I've finished this, I
plan to look for another bottleneck.  Not sure if I'll tackle memory
usage or performance.

As Bob suggested, please feel free to join the party!  Less work for me
to do later.