[llvm] r191792 - Debug Info: remove duplication of DIEs when a DIE is part of the type system

Thu Oct 3 20:50:24 PDT 2013

On Thu, Oct 3, 2013 at 4:24 PM, David Blaikie <dblaikie at gmail.com> wrote:

>
>
>
> On Thu, Oct 3, 2013 at 4:10 PM, Manman Ren <manman.ren at gmail.com> wrote:
>
>>
>>
>>
>> On Thu, Oct 3, 2013 at 2:06 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>>
>>>  3GB memory reduction for xalan built with lto -g (without removing DIE
>>>>> duplication, the memory usage is 7GB)
>>>>>
>>>>
>>>> OK, so you're looking at memory usage during LTO, not the size of the
>>>> resulting debug info? (that's fine, just trying to understand exactly what
>>>> metrics you're targeting)
>>>>
>>>
>>> One thought I had (and just discussed with Eric - he seemed to think it
>>> was plausible, at least):
>>>
>>> If you're interested in reducing peak memory usage of LLVM during LTO,
>>> have you considered modifying debug info generation not to cache all the
>>> compile_units before emitting them? If we emitted one CU at a time as they
>>> were created I imagine the memory footprint would be much lower.
>>>
>>
>> Yes, that is on my todo list after type uniquing is done.
>>
>
> So, as I said, I think it might be better to do that work first. It may
> have a substantial impact on cross-CU references. (see below)
>
>
>> Right now, we don't release any memory and we have layers of memory usage
>> related to debug info: MDNodes, DIEs and some data in MC layer.
>>  All these layers exist for the whole program with LTO.
>>
>>
>>> (and we could still, potentially, add the type cross-referencing by just
>>> caching the section offsets or labels, etc, of the types emitted in prior
>>> units - though that wouldn't be perfect for types that contain some members
>>> (implicit special members) in one CU and a different set of members in
>>> another CU (and it'll be a bit trickier for type units - we'd need to cache
>>> any subgraph of potentially self-referencing type units until all the
>>> signatures are resolved, then they could all be emitted))
>>>
>>> This isn't necessarily a place you should start with - I can appreciate
>>> that your current approach is probably higher value to you for lower
>>> engineering cost (and thus gets you closer sooner) but if it's still not
>>> going to be sufficient for your needs, especially if you're going to have
>>> to implement something like emit-CU-at-a-time as I describe above anyway,
>>> it might be more valuable to lay that foundation first.
>>>
>>
>> Even if we emit one CU at a time, it is still good to share the types
>> across-CU so we don't waist time constructing the type DIEs for each CU,
>> and this approach (remove DIE duplication) also reduces the raw DWARF size.
>>
>
> I agree, from an output size perspective we'll still want to share DIEs
> across CUs - but I think you might see bigger memory footprint (since
> that's the stat you're aiming at at the moment) wins from this change, but
> more importantly - I think a change like that will have a significant
> impact on a feature like the cross-CU DIE referencing you're doing here.
>
> If we emit one CU at a time, cross-CU DIE references will look very
> different - it won't be simply a matter of getting a DIE from the cross-CU
> map anymore. Likely we'll want to store a much more minimal form of the DIE
> (just it's section offset or label) in a table once the prior CU has been
> emitted. Then when a future CU is emitted and wants to reference that, we
> can lookup that remnant in the table - the whole DIE won't exist anymore.
>

With "emit-a-CU-at-a-time", we can still keep the type DIEs around since
they are going to be shared across CUs.
We can also store a minimal form of the type DIEs to further reduce the
memory footprint.

I agree that we have to change the cross-CU references when we only store a
minimal form of the type DIEs. But the change should
not be complicated, we now store a map from MDNode to DIE, and that can be
changed to map from MDNode to MinimalDIE.

"emit-a-CU-at-a-time" is a known solution to reduce memory footprint, but I
don't have a time line for when I will work on it.

>
> It seems like the bigger change should go first so we don't avoid
> discussing a lot of this design only to have to rework it substantially
> when we emit a CU-at-a-time.
>
Since we already did a lot of discussion about this design and it is not
complicated (about 30 to 40 lines of code), let's discuss
"emit-a-CU-at-a-time" when it is somebody's immediate task.

Thanks,
Manman

>
>>
>>
>>>
>>> If not, I'd still probably like to discuss the design of the cross-CU
>>> referencing work you have here to decide on the right design in a normal
>>> patch review process rather than trying to think about how the current code
>>> can be morphed into a better design.
>>>
>>
>> What other designs do you have in mind?
>>
>
> Just some of the stuff we've already been discussing such as merging the
> maps and simplifying the DwarfDebug entry points. But if you're already
> planning to do CU-at-a-time, I wonder whether it's worth discussing the
> design of cross-CU references before we do CU-at-a-time - it seems to me
> like the implementation would change quite significantly.
>
>
>>
>> Manman
>>
>>>
>>>
>>> - David
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131003/f14e75a3/attachment.html>