[PATCH] D36313: [llvm-dwarfdump] - Print section name and index when dumping .debug_info ranges

Tue Aug 8 08:15:56 PDT 2017

(I'll leave this for a bit to let Adrian and others chime in, hopefully)

On Tue, Aug 8, 2017 at 1:40 AM George Rimar <grimar at accesssoftek.com> wrote:

> >Too much/slow? I'd be surprised if it was very expensive to do a single
> >loop over the sections (~hundreds of sections in an object file? Fewer in a
> >linked executable), maybe lazily (when the first range is rendered) - it'd
> >only happen once for the whole dump run (& add a lookup in the resulting
> >map for each range entry dumping - which, yeah, that's something, but
> >still). dwarfdump's an interactive/user-focussed tool, it probably takes
> >longer to print things to the terminal than most of this computation. (&
> >longer still for the user to read it, etc)
>
>
> Maybe. What I meant by "too much" is that I have a feeling that scanning
> over all sections should not be necessary anyway for printing nice output
> here. I don't see a reason for the additional code complication.
>

The reason is to make the output simpler/easier to read - omitting
information that's not adding value where possible. It's certainly possible
the benefit isn't worth the extra complexity, but I imagine it's not too bad.

> There are probably 2 possible cases:
>
> Print ".section.name [index]" or "[index] .section.name". For the latter
> case we do not need to scan over all sections to columnize the "[index]",
> as we can just use the total number of sections to calculate padding. So
> why not use this form?
>
I think that form's probably OK for the cases where the index is helpful
(though does feel a bit convoluted/awkward, putting the index before the
name)
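
To make the padding point concrete, here is a minimal Python sketch (not llvm-dwarfdump code; the function name `format_section_label` is made up for illustration) of the "[index] .section.name" form. It assumes the only input needed for the column width is the total number of sections, so no scan over the section names is required:

```python
def format_section_label(index: int, name: str, total_sections: int) -> str:
    """Render "[index] name", right-aligning the index in a fixed-width column.

    Hypothetical helper for illustration only. The width depends solely on
    the total number of sections, so no per-section scan is required.
    """
    # The largest valid index is total_sections - 1; its digit count sets the width.
    width = len(str(max(total_sections - 1, 0)))
    return f"[{index:>{width}}] {name}"


# With 1500 sections the largest index is 1499, so the column is 4 digits wide.
print(format_section_label(2, ".text", 1500))     # [   2] .text
print(format_section_label(1499, ".text", 1500))  # [1499] .text
```

The trade-off is that every line carries an index, whether or not it helps the reader.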

>
> >>I would not omit something depending on some third condition either,
> >>because it makes the logic of the output unclear for people who are not
> >>familiar with the tool. ("Why does it print the section index for "foo"
> >>and not for "bar"? BUUUG!")
> >
> >Not sure it'd be particularly bug-like if the section numbers were only
> >shown when the section names were ambiguous. Given two examples it would
> >seem fairly clear to me, I think, that the numbers were added to
> >disambiguate two sections with the same name.
>
> If somebody is going to parse the tool's output, then it is easier to have
> a consistent format.
>

This info's probably not ideal for tool consumption anyway - the ranges are
multi-line output, embedded within the DWARF DIE dumping, etc. & we have
nice APIs to use instead :)

> Also (I'll show below) it can be inconvenient or impossible to search for
> a section by name in a section table,
> so if we are going to print the section index at least in some cases, I
> think it is worth printing it in all cases.
>

Sure - that's why I was suggesting that the indexes be omitted only when
the section name is unambiguous (hence the need to walk all the sections
up-front to count how many times each name appears).
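
As a rough illustration of that up-front walk, here is a minimal Python sketch (not llvm-dwarfdump code; `SectionLabeler` and its methods are hypothetical names). It assumes the section names are available as a simple list indexed by section number, counts each name lazily on the first lookup, and appends the index only when the name occurs more than once:

```python
from collections import Counter
from typing import Optional, Sequence


class SectionLabeler:
    """Hypothetical helper: appends the index only for ambiguous names."""

    def __init__(self, section_names: Sequence[str]) -> None:
        self._names = section_names
        self._counts: Optional[Counter] = None  # built on first use

    def label(self, index: int) -> str:
        if self._counts is None:
            # Single pass over all sections, done at most once per dump run.
            self._counts = Counter(self._names)
        name = self._names[index]
        if self._counts[name] > 1:
            return f"{name} [{index}]"
        return name


# Names of the first eight entries in the readelf listing quoted in this thread.
names = ["", ".strtab", ".text", ".rela.text", ".group", ".text", ".group", ".text"]
labeler = SectionLabeler(names)
print(labeler.label(1))  # .strtab
print(labeler.label(5))  # .text [5]
```

The extra cost is one pass over the section list plus one dictionary lookup per range entry.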

>
> >>Also I think often I am looking for section number in
> >>readelf/objdump/other tools,
> >
> >Really? That surprises me - why are the section numbers of interest to
> >you? For myself I'm usually interested in the section name.
>
> Ah, the section name is often the final point of interest for sure, but
> numbers are very important to have. See:
>
> When I do readelf -a for some file and look at the symbol table, I see
> section index "1" for "main";
> using its index I can then look into the section table.
> Symbol table '.symtab' contains 16 entries:
>    Num:    Value          Size Type    Bind   Vis      Ndx Name
>      0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
>      1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS main.c
> ....
>     13: 0000000000000000    21 FUNC    GLOBAL DEFAULT    1 main
>
> But if we had only the name here, like:
> 13: 0000000000000000    21 FUNC    GLOBAL DEFAULT    ".text.foobar" main
>
> It would be harder or impossible to find it by name in the section table.
> For example, below is the output of the following:
> ar -x libclangAnalysis.a
> readelf -a ReachableCode.cpp.o
>
> Section Headers:
>   [Nr] Name              Type             Address           Offset
>        Size              EntSize          Flags  Link  Info  Align
>   [ 0]                   NULL             0000000000000000  00000000
>        0000000000000000  0000000000000000           0     0     0
>   [ 1] .strtab           STRTAB           0000000000000000  003e8080
>        00000000000094e9  0000000000000000           0     0     1
>   [ 2] .text             PROGBITS         0000000000000000  00000040
>        0000000000002c96  0000000000000000  AX       0     0     16
>   [ 3] .rela.text        RELA             0000000000000000  002a7818
>        0000000000001c38  0000000000000018          1486     2     8
>   [ 4] .group            GROUP            0000000000000000  0029ede0
>        0000000000000008  0000000000000004          1486   1136     4
>   [ 5] .text             PROGBITS         0000000000000000  00002ce0
>        0000000000000011  0000000000000000 AXG       0     0     16
>   [ 6] .group            GROUP            0000000000000000  0029ede8
>        000000000000000c  0000000000000004          1486   968     4
>   [ 7] .text             PROGBITS         0000000000000000  00002d00
>        00000000000000a7  0000000000000000 AXG       0     0     16
> <~1500 similar sections skipped here.>
> .....
>
> As you can see, the section name alone may not be enough.
>
>
> >I'd skip printing it entirely if it's empty. 'Section: ""' doesn't seem
> >helpful.
> >
> >I think printing it on the same line without the "Section: " prefix, as
> >it was in your first version, is probably better when it is per-line.
> >
> >I guess maybe you're optimizing this for the case where some set of
> >ranges share the same section (so it prints a "Section: "x"" header and
> >then all the ranges in that section? (or all the ranges in that section
> >that are contiguous until another section change?)) Though I don't see a
> >test case for that (multiple sections, some ranges next to each other in
> >the same section).
> >
>
> You earlier wrote: "& what about omitting the name or putting it
> somewhere else (like at the start on a separate line) if every entry is in
> the same section? (which will be the case for all ranges except the
> compile_unit ranges, most likely)", so that is what I tried to implement.
>
> >But I don't think that's a scenario we really need to optimize for - it's
> >going to be pretty rare that a list contains both multiple sections and
> >distinct ranges in the same section. (function/scope ranges will contain
> >ranges all from the same section and CU ranges will /mostly/ contain all
> >ranges from distinct sections - except in cases of nodebug functions
> >putting holes in the range)
>
> George.
>