[llvm-dev] Linker option to dump dependency graph

Tue Feb 26 14:23:18 PST 2019

Hi,

I've heard people say that they want to analyze dependencies between object
files at the linker level so that they can run a whole-program analysis
which cannot be done at the compiler that works for one compilation unit at
a time. I'd like to start a discussion as to what we can do with it and how
to make it possible. I'm also sharing my idea about how to make it possible.

*Dependency analyses*
First, let me start with a few examples of analyses I'm heard of or
thinking about. Dependencies between object files can be represented as a
graph where vertices are input sections and edges are symbols and
relocations. Analyses would work on the dependency graph. Examples of
analyses include but not limited to the following:

 - Figure out why some library or an object file gets linked.

 - Finding a candidate to eliminate dependency by finding a "weak" link to
a library. We can for example say the dependency to a library is *weak* if
the library in the graph can be unreachable if we remove *N* edges from the
graph (which is likely to correspond to removing *N* function calls from
the code), where *N* is a small number.

 - Understanding which of new dependencies increase the executable size the
most, compare to a previous build.

 - Finding bad or circular dependencies between sub-components.

There would be many more analyses you want to run at the linker input
level. Currently, lld doesn't actively support such analyses. There are a
few options to make the linker emit dependency information (e.g. --cref or
-Map), but the output of the options is not comprehensive; you cannot
reconstruct a dependency graph from the output of the options.

*Dumping dependency graph*
So, I'm thinking if it would be desirable to add a new feature to the
linker to dump an entire dependency graph in such a way that a graph can be
reconstructed by reading it back. Once we have such feature, we can link a
program with the feature enabled and run any kind of dependency analysis on
the output. You can save dumps to compare to previous builds. You can run
any number of analyses on a dump, instead of invoking the linker for each
analysis.

I don't have a concrete idea about the file output format, but I believe it
is essentially enough to emit triplets of (<from input section>, <symbol>,
<to input section>), which represents an edge, to reconstruct a graph.

Thoughts?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190226/b3fd9b27/attachment.html>