[llvm-dev] Linker option to dump dependency graph

Wed Feb 27 11:34:33 PST 2019

On Wed, Feb 27, 2019 at 2:37 AM Peter Smith <peter.smith at linaro.org> wrote:

> Hello,
>
> I think outputting a dependency graph is a good idea and would enable
> some offline analysis. I think that there is some advantage to
> building some of the simpler ones in, particularly those that would
> need heavy annotations to the dependency graph, in particular unless
> we write a sample analysis tool that ships with the release, many
> users are going to miss out on useful features as they aren't going to
> have the time to build one. I've put some comments inline:
>
> On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >
> > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at google.com> wrote:
> >>
> >> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer <bigcheesegs at gmail.com>
> wrote:
> >>>
> >>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I've heard people say that they want to analyze dependencies between
> object files at the linker level so that they can run a whole-program
> analysis which cannot be done at the compiler that works for one
> compilation unit at a time. I'd like to start a discussion as to what we
> can do with it and how to make it possible. I'm also sharing my idea about
> how to make it possible.
> >>>>
> >>>> Dependency analyses
> >>>> First, let me start with a few examples of analyses I'm heard of or
> thinking about. Dependencies between object files can be represented as a
> graph where vertices are input sections and edges are symbols and
> relocations. Analyses would work on the dependency graph. Examples of
> analyses include but not limited to the following:
> >>>>
> >>>>  - Figure out why some library or an object file gets linked.
> >>>>
>
> Arm's proprietary linker has a very helpful feature in verbose mode
> where it will report on object loading: global/weak definitions and
> global/weak references. For libraries you'd get a message like
> selecting member.o from library.a to define symbol S. This resulted in
> quite an effective trace of the linker output that could answer most
> "why did this library and object file get loaded question?" One thing
> a dependency graph might not capture is the order in which events
> occur, this can be very useful when debugging problems caused by
> library selection order.
>
> >>>>  - Finding a candidate to eliminate dependency by finding a "weak"
> link to a library. We can for example say the dependency to a library is
> weak if the library in the graph can be unreachable if we remove N edges
> from the graph (which is likely to correspond to removing N function calls
> from the code), where N is a small number.
> >>>>
> >>>>  - Understanding which of new dependencies increase the executable
> size the most, compare to a previous build.
> >>>>
>
> Arm's linker, being focused on embedded systems has a useful feature
> that summarises the amount of content taken from each object broken
> down into code, ro-data, rw-date etc. This can be helpful in the face
> of comdat group elimination and optimisations such as garbage
> collection and ICF that can be difficult to predict from a dependency
> graph. It is true that this information could be added as attributes
> but again it may just be easier to write a simple analysis pass over
> the output in the linker.
>
> >>>>  - Finding bad or circular dependencies between sub-components.
> >>>>
> >>>> There would be many more analyses you want to run at the linker input
> level. Currently, lld doesn't actively support such analyses. There are a
> few options to make the linker emit dependency information (e.g. --cref or
> -Map), but the output of the options is not comprehensive; you cannot
> reconstruct a dependency graph from the output of the options.
>
>
>
> >>>>
> >>>> Dumping dependency graph
> >>>> So, I'm thinking if it would be desirable to add a new feature to the
> linker to dump an entire dependency graph in such a way that a graph can be
> reconstructed by reading it back. Once we have such feature, we can link a
> program with the feature enabled and run any kind of dependency analysis on
> the output. You can save dumps to compare to previous builds. You can run
> any number of analyses on a dump, instead of invoking the linker for each
> analysis.
> >>>>
> >>>> I don't have a concrete idea about the file output format, but I
> believe it is essentially enough to emit triplets of (<from input section>,
> <symbol>, <to input section>), which represents an edge, to reconstruct a
> graph.
> >>>>
> >>>> Thoughts?
> >>>
> >>>
> >>> Back when I worked on the linker I pretty much always had a way to
> dump a graphviz dot file to look at things.  Pretty much every graph
> library/tool can read dot files, and they are easy to hack up a parser
> for.  You can also add attributes to nodes and edges to store arbitrary
> data.
> >>
> >>
> >> That's an interesting idea.
> >>
> >>> As for what to put it in, it really depends on how detailed it needs
> to be.  Should symbols and sections be collapsed together?  Should it
> include relocation types? Symbol types/binding/size/etc?
> >>
> >>
> >>  Maybe everything? We can for example emit all symbols and input
> sections first, and then emit a graph as the second half of the output. E.g.
> >>
> >> Symbols:
> >>   <list of symbols>
> >> Sections:
> >>   <list of sections>
> >> Graph:
> >>  1 2 3  // 1st section depends on 3rd section via 2nd symbol
> >>  5 1 4  // likewise
> >
> >
> > I suppose it's a question of if we want users to need to also read the
> inputs if they want things like section size and other section/symbol
> attributes.  It would be pretty trivial to include that data as long as we
> have a format/syntax for it.
> >
> > dot supports listing nodes first with attributes and then referring to
> them by name later when listing edges.
> >
> > - Michael Spencer
> >
>
> I've experimented with dot files for this type of thing in the past.
> The difficulty is that they get too large to be realistically viewed
> very quickly. At that point you need to write scripts to process the
> output and in that case you may as well use JSON or XML, which I guess
> could easily be processed into dot files. To summarise, I think we may
> be able to do quite well with some very simple extra analysis in LLD,
> a machine readable dependency graph would also be very useful for the
> more complex cases.
>
> Peter
>
>
Indeed they often get too large for graphviz, but there are a few other dot
viewers that can handle huge graphs, such as Gephi.  I often use these when
graphs get too large for graphviz.

- Michael Spencer
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190227/7c462982/attachment.html>