[PATCH] Remove wild .debug_aranges entries generated from unimportant labels

Eric Christopher echristo at gmail.com
Wed Oct 2 22:46:43 PDT 2013


On Wed, Oct 2, 2013 at 6:05 PM, Richard Mitton <richard at codersnotes.com>wrote:

>  It seems though that common symbols get converted into just regular BSS
> data upon link? (they do on my system at least). Common symbols seem to get
> assigned a valid address and size, not zero. I agree that writing out zero
> entries would be stupid, but that doesn't seem to be what we're doing.
>
>
Yep. And we've got a relocation, at least for the common symbols, so they
should get relocated at link time.


> I think I'd rather see the same address listed under two CUs (if it indeed
> appears in both CUs), than not at all.
>
>
Fair enough. Given we don't have a consumer it seems like bloat/complexity
for no use at the moment (much like column information unfortunately), but
I'm not wedded to removing it either.


> (side note: IMHO llvm shouldn't really even support common symbols, surely
> comdat would be a better way of doing the same thing? And then we could
> emit debug_info chunks as comdat too and the problem would resolve itself.
> Well I can dream...)
>

Mmm, type units will solve a huge chunk of the duplicate info problem,
though we'd have to emit pieces of the arange table as comdat and not
unique it? That sounds... complicated :)

At any rate, Alexey's patch still seems fine and we can go with that for
now and discuss anything else later.

-eric


>
>
> Richard Mittonrichard at codersnotes.com
>
> On 10/02/2013 05:53 PM, Eric Christopher wrote:
>
>
>
>
> On Wed, Oct 2, 2013 at 4:21 PM, Richard Mitton <richard at codersnotes.com>wrote:
>
>> I'd rather keep it in. I'm sure gcc might not emit the arange table
>> correctly, but the DWARF specs are quite clear that the idea of an arange
>> table is to map *every* byte in the program to it's debug CU. Not just
>> text. If there's a reference in the debug_info to an address, the arange
>> table should have the reverse of that. This is why I used the label list to
>> generate it, rather than trying to pick out functions/variables/etc. The
>> labels added to debug_info define exactly the set of addresses required.
>>
>>
>  Data in general is fine to keep in, however, ...
>
>
>>  Common symbols have an address in the final program, so they should be
>> included too.
>>
>>
>  This I'm going to disagree with, and the dwarf standard seems to back me
> up here:
>
>  The table consists of sets of variable length entries, each set
> describing the portion of the program’s address space that is covered by a
> single compilation unit.
>
>  which says to me that the only items that should be included in the
> aranges are items that are going to be unique to that compilation unit -
> which would leave out common data. And a debugger can always use the symbol
> table here anyhow. I don't know that we want a bunch of entries in the
> aranges table that point to an address of 0 with a length of 1 - it doesn't
> strike me as useful or desirable.
>
>
>>  Debuggers historically don't always use the full capabilities of the
>> DWARF data, because the compilers don't generate correct data. If we fix
>> the compiler here once and for all, it enables future debuggers to make use
>> of it.
>>
>>
>  Heh. Except they won't, but I understand the point you're trying to
> make. FWIW I'm attempting to get rid of pubnames and pubtypes as well.
>
>
>>  The biggest challenge faced by debuggers today is compilers which only
>> emit barely enough DWARF to function. We can do better than that.
>>
>>
>  Agreed, somewhat. The problem is that they have to work with what they
> can depend upon.
>
>  -eric
>
>
>>     Eric Christopher <echristo at gmail.com>
>>  Wednesday, October 02, 2013 2:35 PM
>>   *nod* The fix is fine.
>>
>> I think the additional complication that got us here is the adding of
>> data symbols to aranges. Now, as far as I can tell, gcc doesn't emit
>> aranges for any of them, however, the real problem for us are common
>> symbols. I think we want to omit ranges for common symbols - it
>> doesn't really seem to make sense anyhow.
>>
>> I know that gdb and the various gnu tools don't use that part of the
>> information, but I don't know of any other users. So objections to
>> removing that part? It'll definitely greatly simplify the code.
>>
>> -eric
>>
>>     Richard Mitton <richard at codersnotes.com>
>>  Wednesday, October 02, 2013 2:00 PM
>>   LGTM, although I dunno if I'm actually authorized to approve patches :)
>>
>> There's code in there already that ignores labels in metadata sections,
>> but it wasn't getting triggered because debug_loc labels are added after
>> the test for it. And the stupid sectionless common symbols mean that it
>> couldn't just ignore NULL sections either.
>>
>> This looks like a fine fix.
>>
>> Richard Mitton
>> richard at codersnotes.com
>> On 10/02/2013 01:33 PM, Alexey Samsonov wrote:
>>
>>  _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131002/bfd17065/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1417b7a1a41d92ce_compose-unknown-contact.jpg
Type: image/jpeg
Size: 770 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131002/bfd17065/attachment.jpg>


More information about the llvm-commits mailing list