[PATCH] D40950: [ELF] - Fail when multiple .debug_* sections are used in a single object.

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 11 07:40:28 PST 2017


On Mon, Dec 11, 2017 at 7:14 AM James Henderson <
jh7370.2008 at my.bristol.ac.uk> wrote:

> Hi,
>
> My colleagues and I have actually been doing quite a bit of investigation
> in the past couple of weeks into what to do about debug data that refers to
> sections that are discarded due to --gc-sections etc. Currently, LLD treats
> the references as address zero, but at least on our platform, address zero
> is a valid address, so we were looking at alternatives. One of the ideas
> that I personally looked at was based on section E.3 of the DWARF4/5
> specifications, i.e. having a separate .debug_info section for each
> subprogram, each of which imports from a "main" .debug_info section
> containing shared information. I was able to prototype a change to LLVM
> that did this without too much effort, even though I hadn't looked at that
> area of the code before. I then had to modify LLD to recognise
> these new info sections as being dependent on the corresponding text
> section, which was a relatively simple change to make, thanks to the
> mechanism for dependent sections already being there.
>

It might be possible to avoid this linker-awareness/dependence by putting
the debug info and function in the same COMDAT group?


> It didn't require inspecting the contents of the section, only the section
> name in my case (the sections in my experiment were named things like
> .debug_info._Z3foov). I also prototyped changes to llvm-dwarfdump to print
> the .debug_info for these multiple sections. Again, this was relatively
> simple, although I certainly didn't experiment beyond a very simple case.
> Similar investigations are being conducted into having multiple instances
> of other .debug_* sections.
>
> We haven't yet decided whether this is an approach we'd like to push for,
> as there are some downsides (e.g. larger debug information)
>
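
The name-based association described above (sections named things like
.debug_info._Z3foov) could be sketched roughly as follows. This is only an
illustration: textSectionFor is a hypothetical helper, not an LLD function,
and the ".text.<symbol>" naming of the matching text section is an
assumption (it is what -ffunction-sections typically produces), not
something stated in the prototype description.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of LLD): map a per-subprogram debug section
// name such as ".debug_info._Z3foov" to the text section it would depend on.
// Assumes text sections are named ".text.<symbol>".
std::optional<std::string> textSectionFor(std::string_view debugName) {
  constexpr std::string_view prefix = ".debug_info.";
  // The plain ".debug_info" section (and anything else) has no single owner.
  if (debugName.size() <= prefix.size() ||
      debugName.substr(0, prefix.size()) != prefix)
    return std::nullopt;
  return ".text." + std::string(debugName.substr(prefix.size()));
}
```

A linker could then mark the debug section as dependent on the returned
text section, so that discarding the function also discards its debug info.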

Yeah, that'd be my immediate concern. It'd be interesting to see the data on
how much larger. (There might also be some things we could do to improve
inter-unit referencing that would reduce some of that overhead, helping
type units in the process.)

But I'm still sort of leaning towards "let's have a good library for fully
DWARF aware merging and use it in llvm-dsymutil (which already has this
functionality, so might be the basis for such a library), llvm-dwp, and
lld". It'd be a pretty significant win in debug info size, but I understand
it's a bit of a weird layering violation compared to previous DWARF work,
so realize this may take some discussion to see if there's general
consensus for that direction.


> but my point is that I wouldn't be so certain that we won't need to handle
> multiple .debug_info sections correctly in the relatively near term, so I'd
> much rather go with option 2).
>
> Regards,
>
> James
>
> On 8 December 2017 at 08:49, George Rimar via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>> Thanks for the comments, David! Let me answer from the end.
>>
>>
>> > It seems strange to me that the linker would special case the debug_*
>> > sections (& that that special casing would limit how they can be used)
>> > given the discussion we were having about the linker wanting to treat
>> > debug info as just normal sections.
>>
>>
>> There are two different things about handling such sections in the linker.
>>
>> We are talking about multiple .debug_info* sections, where there is some
>> main .debug_info and multiple COMDAT .debug_info* sections with types.
>> The linker deduplicates COMDATs and does not care about section names,
>> flags, etc., so it handles debug_* sections just like normal sections, and
>> that already works as expected.
>>
>>
>> The issue my patch is trying to address is a different case and comes
>> from the llvm::DWARFObject class. We use it when LLD does error reporting
>> (to get information about source lines) and for --gdb-index. In both
>> cases it relies on the current DWARFObject API, which among other things
>> assumes that there is a single .debug_info, .debug_abbrev and .debug_line.
>> (
>> https://github.com/llvm-mirror/llvm/blob/master/include/llvm/DebugInfo/DWARF/DWARFObject.h#L36
>> )
>>
>>
>> What I really wanted to do in this patch is to error out when this class
>> is used and multiple of the above sections are present, because with the
>> current implementation the result would be a mess: we would use only the
>> last debug section with a given name, and the result would be undefined.
>>
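
To make the "last section with a given name wins" problem concrete, here is
a minimal standalone sketch. It does not use the real llvm::DWARFObject API;
Section, indexByName and findDuplicates are hypothetical names invented for
illustration, and the set of names expected to be unique is passed in by the
caller rather than taken from any real implementation.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an input section: just a name and its contents.
struct Section {
  std::string name;
  std::string data;
};

// Naive indexing by name: a later section silently replaces an earlier one,
// so only the last section with a given name survives.
std::map<std::string, std::string>
indexByName(const std::vector<Section> &sections) {
  std::map<std::string, std::string> out;
  for (const Section &s : sections)
    out[s.name] = s.data;
  return out;
}

// Return the names, among those expected to be unique, that occur more than
// once. A caller could report an error for each instead of guessing.
std::vector<std::string>
findDuplicates(const std::vector<Section> &sections,
               const std::set<std::string> &mustBeUnique) {
  std::map<std::string, int> seen;
  std::vector<std::string> dups;
  for (const Section &s : sections)
    if (mustBeUnique.count(s.name) && ++seen[s.name] == 2)
      dups.push_back(s.name); // record each duplicated name once
  return dups;
}
```

Passing only .debug_info, .debug_abbrev and .debug_line as the unique set
leaves repeated .debug_types sections alone, which is the case the original
patch got wrong.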
>> My approach was not entirely correct: as you mentioned, it would not work
>> in the case of multiple .debug_types sections. Honestly, I forgot about
>> them when I tried to simplify the code; we do not use these sections
>> there and had no test that would fail. I'll add one.
>>
>>
>> >What's the error handling case you mention?
>>
>>
>> One of the use cases I mentioned above (for reference, our code is
>> https://github.com/llvm-mirror/lld/blob/master/ELF/InputFiles.cpp#L73).
>> In short, we use .debug_line for reporting error locations and scan
>> .debug_info for variables in order to report them when reporting
>> duplicate declarations.
>>
>> > Presumably it works for multiple debug_types sections, so perhaps that
>> > support could be generalized to multiple debug_info sections as well?
>> > Then this failure could be restricted to only multiple debug_info
>> > sections when using gdb-index?
>> >
>> > That way DWARF5 type units (that use debug_info sections) would just
>> > work? (except when using gdb-index)
>>
>> I do not see a way it could be restricted to only the --gdb-index case. We
>> discussed only the multiple .debug_info case, I think, but it looks like
>> an important point that the DWARF5 spec at p366 (
>> http://dwarfstd.org/doc/DWARF5.pdf) also mentions that there can be
>> multiple .debug_abbrev, .debug_info and .debug_line sections. We use them
>> for error reporting as well (not us directly, but the llvm::DWARFObject
>> we rely on).
>>
>> Given all the above, I see 2 solutions.
>> 1) As far as I can tell, there are no producers of multiple
>> .debug_info/.debug_abbrev/.debug_line sections yet. And at least based on
>> our discussions, it looks like we do not plan to support this in the near
>> future (and it is actually unclear whether we ever will).
>> That is why I prepared a patch using a less intrusive approach that
>> restricts multiple debug sections. What I can do is fix/update it so that
>> it just blacklists duplicates of the sections we expect to be unique.
>>
>> 2) We could teach llvm::DWARFObject to handle multiple
>> .debug_info/.debug_abbrev/.debug_line sections. That would be helpful
>> for tools like llvm-dwarfdump (in case they are used on some wild object
>> with such sections) and would allow LLD to work with such objects.
>> Given our discussion I am not sure it is useful or worth doing, but I will
>> be happy to try to implement it if that route is chosen.
>>
>> What do you guys think? What should we do?
>>
>> George.
>>
>