<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Oct 4, 2013 at 11:38 AM, Manman Ren <span dir="ltr"><<a href="mailto:manman.ren@gmail.com" target="_blank">manman.ren@gmail.com</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br></div><div class="gmail_extra"><div class="gmail_quote"><div>

On Fri, Oct 4, 2013 at 9:41 AM, Eric Christopher <span dir="ltr"><<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>></span> wrote:<br>


</div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi Manman,<br>

<br>

In reviewing this code I've found a few problems with how you're<br>

emitting the debug info for ref_addr. </blockquote><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">


It needs to be a relocatable<br>

value from the beginning of the section and while it's a reasonably<br>

straightforward change it is going to require some work to plumb I<br>

think.</blockquote><div><br></div></div><div>The issue seems to be related to handling of ref_addr in general.</div><div>So we might have problem even after this commit is reverted.<br></div><div><br></div><div>ref_addr currently uses the offset from beginning of the debug_info section, and the following code has been there since r176882 (I added it in, so it is still my bug :)<br>


</div><div><div><div>      unsigned Addr = Origin->getOffset();</div></div><div>      if (Form == dwarf::DW_FORM_ref_addr) {</div><div>        // For DW_FORM_ref_addr, output the offset from beginning of debug info</div>


<div>        // section. Origin->getOffset() returns the offset from start of the</div>


<div>        // compile unit.</div><div>        DwarfUnits &Holder = useSplitDwarf() ? SkeletonHolder : InfoHolder;</div><div>        Addr += Holder.getCUOffset(Origin->getCompileUnit());</div><div>      }</div></div>


<div>Of course, this commits start to use ref_addr much more frequently.</div></div></div></div></blockquote><div><br></div><div>Agreed. But without /any/ test coverage it's hard to guess how this may or may not break things. We should go back and have a discussion about that change (I replied to the original commit to restart the conversation there).</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">


With this and the design discussions we've been going through<br>

with it I'm going to revert the patch for now and get you a testcase.<br></blockquote><div><br></div></div><div>Did we reach any conclusion on the design discussions? It seems to me we should not gate this on implementation of type units.</div>

</div></div></div></blockquote><div><br></div><div>As I've said on this thread and IRC, I don't intend to gate this on type units.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">


<div>And this should be implemented in a similar way as this commit. </div></div></div></div></blockquote><div><br>We've bees discussing the design, I think it'll be easier to discuss with an actual patch review. Please work up a patch that addresses, as best as you understand it, the design concerns we've discussed in this thread and the bug shown, and send it out for pre-commit review.<br>

 </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>What other concerns do you have?</div><div>

<br></div><div>Thanks,</div><div>Manman</div><div><div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">


<br>

This is a good idea for work and is something I definitely want us to<br>

pursue in the long run (or immediate term depending on your<br>

priorities) so don't look at this as anything more than me needing to<br>

revert due to a bug.<br>

<br>

Thanks!<br>

<span><font color="#888888"><br>

-eric<br>

</font></span><div><div><br>

<br>

On Thu, Oct 3, 2013 at 8:50 PM, Manman Ren <<a href="mailto:manman.ren@gmail.com" target="_blank">manman.ren@gmail.com</a>> wrote:<br>

><br>

><br>

><br>

> On Thu, Oct 3, 2013 at 4:24 PM, David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>> wrote:<br>

>><br>

>><br>

>><br>

>><br>

>> On Thu, Oct 3, 2013 at 4:10 PM, Manman Ren <<a href="mailto:manman.ren@gmail.com" target="_blank">manman.ren@gmail.com</a>> wrote:<br>

>>><br>

>>><br>

>>><br>

>>><br>

>>> On Thu, Oct 3, 2013 at 2:06 PM, David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>> wrote:<br>

>>>><br>

>>>><br>

>>>>>> 3GB memory reduction for xalan built with lto -g (without removing DIE<br>

>>>>>> duplication, the memory usage is 7GB)<br>

>>>>><br>

>>>>><br>

>>>>> OK, so you're looking at memory usage during LTO, not the size of the<br>

>>>>> resulting debug info? (that's fine, just trying to understand exactly what<br>

>>>>> metrics you're targeting)<br>

>>>><br>

>>>><br>

>>>> One thought I had (and just discussed with Eric - he seemed to think it<br>

>>>> was plausible, at least):<br>

>>>><br>

>>>> If you're interested in reducing peak memory usage of LLVM during LTO,<br>

>>>> have you considered modifying debug info generation not to cache all the<br>

>>>> compile_units before emitting them? If we emitted one CU at a time as they<br>

>>>> were created I imagine the memory footprint would be much lower.<br>

>>><br>

>>><br>

>>> Yes, that is on my todo list after type uniquing is done.<br>

>><br>

>><br>

>> So, as I said, I think it might be better to do that work first. It may<br>

>> have a substantial impact on cross-CU references. (see below)<br>

>><br>

>>><br>

>>> Right now, we don't release any memory and we have layers of memory usage<br>

>>> related to debug info: MDNodes, DIEs and some data in MC layer.<br>

>>>  All these layers exist for the whole program with LTO.<br>

>>><br>

>>>><br>

>>>> (and we could still, potentially, add the type cross-referencing by just<br>

>>>> caching the section offsets or labels, etc, of the types emitted in prior<br>

>>>> units - though that wouldn't be perfect for types that contain some members<br>

>>>> (implicit special members) in one CU and a different set of members in<br>

>>>> another CU (and it'll be a bit trickier for type units - we'd need to cache<br>

>>>> any subgraph of potentially self-referencing type units until all the<br>

>>>> signatures are resolved, then they could all be emitted))<br>

>>>><br>

>>>> This isn't necessarily a place you should start with - I can appreciate<br>

>>>> that your current approach is probably higher value to you for lower<br>

>>>> engineering cost (and thus gets you closer sooner) but if it's still not<br>

>>>> going to be sufficient for your needs, especially if you're going to have to<br>

>>>> implement something like emit-CU-at-a-time as I describe above anyway, it<br>

>>>> might be more valuable to lay that foundation first.<br>

>>><br>

>>><br>

>>> Even if we emit one CU at a time, it is still good to share the types<br>

>>> across-CU so we don't waist time constructing the type DIEs for each CU, and<br>

>>> this approach (remove DIE duplication) also reduces the raw DWARF size.<br>

>><br>

>><br>

>> I agree, from an output size perspective we'll still want to share DIEs<br>

>> across CUs - but I think you might see bigger memory footprint (since that's<br>

>> the stat you're aiming at at the moment) wins from this change, but more<br>

>> importantly - I think a change like that will have a significant impact on a<br>

>> feature like the cross-CU DIE referencing you're doing here.<br>

>><br>

>> If we emit one CU at a time, cross-CU DIE references will look very<br>

>> different - it won't be simply a matter of getting a DIE from the cross-CU<br>

>> map anymore. Likely we'll want to store a much more minimal form of the DIE<br>

>> (just it's section offset or label) in a table once the prior CU has been<br>

>> emitted. Then when a future CU is emitted and wants to reference that, we<br>

>> can lookup that remnant in the table - the whole DIE won't exist anymore.<br>

><br>

><br>

> With "emit-a-CU-at-a-time", we can still keep the type DIEs around since<br>

> they are going to be shared across CUs.<br>

> We can also store a minimal form of the type DIEs to further reduce the<br>

> memory footprint.<br>

><br>

> I agree that we have to change the cross-CU references when we only store a<br>

> minimal form of the type DIEs. But the change should<br>

> not be complicated, we now store a map from MDNode to DIE, and that can be<br>

> changed to map from MDNode to MinimalDIE.<br>

><br>

> "emit-a-CU-at-a-time" is a known solution to reduce memory footprint, but I<br>

> don't have a time line for when I will work on it.<br>

><br>

>><br>

>><br>

>> It seems like the bigger change should go first so we don't avoid<br>

>> discussing a lot of this design only to have to rework it substantially when<br>

>> we emit a CU-at-a-time.<br>

><br>

> Since we already did a lot of discussion about this design and it is not<br>

> complicated (about 30 to 40 lines of code), let's discuss<br>

> "emit-a-CU-at-a-time" when it is somebody's immediate task.<br>

><br>

> Thanks,<br>

> Manman<br>

><br>

>><br>

>>><br>

>>><br>

>>>><br>

>>>><br>

>>>> If not, I'd still probably like to discuss the design of the cross-CU<br>

>>>> referencing work you have here to decide on the right design in a normal<br>

>>>> patch review process rather than trying to think about how the current code<br>

>>>> can be morphed into a better design.<br>

>>><br>

>>><br>

>>> What other designs do you have in mind?<br>

>><br>

>><br>

>> Just some of the stuff we've already been discussing such as merging the<br>

>> maps and simplifying the DwarfDebug entry points. But if you're already<br>

>> planning to do CU-at-a-time, I wonder whether it's worth discussing the<br>

>> design of cross-CU references before we do CU-at-a-time - it seems to me<br>

>> like the implementation would change quite significantly.<br>

>><br>

>>><br>

>>><br>

>>> Manman<br>

>>>><br>

>>>><br>

>>>><br>

>>>> - David<br>

>>><br>

>>><br>

>><br>

><br>

</div></div></blockquote></div></div></div><br></div></div>

</blockquote></div><br></div></div>