[llvm-dev] Linker option to dump dependency graph

Michael Spencer via llvm-dev llvm-dev at lists.llvm.org
Tue Feb 26 16:31:08 PST 2019


On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at google.com> wrote:

> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer <bigcheesegs at gmail.com>
> wrote:
>
>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi,
>>>
>>> I've heard people say that they want to analyze dependencies between
>>> object files at the linker level so that they can run a whole-program
>>> analysis which cannot be done at the compiler that works for one
>>> compilation unit at a time. I'd like to start a discussion as to what we
>>> can do with it and how to make it possible. I'm also sharing my idea about
>>> how to make it possible.
>>>
>>> *Dependency analyses*
>>> First, let me start with a few examples of analyses I'm heard of or
>>> thinking about. Dependencies between object files can be represented as a
>>> graph where vertices are input sections and edges are symbols and
>>> relocations. Analyses would work on the dependency graph. Examples of
>>> analyses include but not limited to the following:
>>>
>>>  - Figure out why some library or an object file gets linked.
>>>
>>>  - Finding a candidate to eliminate dependency by finding a "weak" link
>>> to a library. We can for example say the dependency to a library is
>>> *weak* if the library in the graph can be unreachable if we remove *N* edges
>>> from the graph (which is likely to correspond to removing *N* function
>>> calls from the code), where *N* is a small number.
>>>
>>>  - Understanding which of new dependencies increase the executable size
>>> the most, compare to a previous build.
>>>
>>>  - Finding bad or circular dependencies between sub-components.
>>>
>>> There would be many more analyses you want to run at the linker input
>>> level. Currently, lld doesn't actively support such analyses. There are a
>>> few options to make the linker emit dependency information (e.g. --cref or
>>> -Map), but the output of the options is not comprehensive; you cannot
>>> reconstruct a dependency graph from the output of the options.
>>>
>>> *Dumping dependency graph*
>>> So, I'm thinking if it would be desirable to add a new feature to the
>>> linker to dump an entire dependency graph in such a way that a graph can be
>>> reconstructed by reading it back. Once we have such feature, we can link a
>>> program with the feature enabled and run any kind of dependency analysis on
>>> the output. You can save dumps to compare to previous builds. You can run
>>> any number of analyses on a dump, instead of invoking the linker for each
>>> analysis.
>>>
>>> I don't have a concrete idea about the file output format, but I believe
>>> it is essentially enough to emit triplets of (<from input section>,
>>> <symbol>, <to input section>), which represents an edge, to reconstruct a
>>> graph.
>>>
>>> Thoughts?
>>>
>>
>> Back when I worked on the linker I pretty much always had a way to dump a
>> graphviz dot file to look at things.  Pretty much every graph library/tool
>> can read dot files, and they are easy to hack up a parser for.  You can
>> also add attributes to nodes and edges to store arbitrary data.
>>
>
> That's an interesting idea.
>
> As for what to put it in, it really depends on how detailed it needs to
>> be.  Should symbols and sections be collapsed together?  Should it include
>> relocation types? Symbol types/binding/size/etc?
>>
>
>  Maybe everything? We can for example emit all symbols and input sections
> first, and then emit a graph as the second half of the output. E.g.
>
> Symbols:
>   <list of symbols>
> Sections:
>   <list of sections>
> Graph:
>  1 2 3  // 1st section depends on 3rd section via 2nd symbol
>  5 1 4  // likewise
>

I suppose it's a question of if we want users to need to also read the
inputs if they want things like section size and other section/symbol
attributes.  It would be pretty trivial to include that data as long as we
have a format/syntax for it.

dot supports listing nodes first with attributes and then referring to them
by name later when listing edges.

- Michael Spencer
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190226/98b7748a/attachment.html>


More information about the llvm-dev mailing list