[llvm-dev] Linker option to dump dependency graph

Peter Collingbourne via llvm-dev llvm-dev at lists.llvm.org
Thu Feb 28 21:18:00 PST 2019


You might have realized this already but it's probably not a good idea to
use InputSection::Relocations for this because that ends up missing
anything that becomes a dynamic relocation. I reckon that the code should
be doing exactly what MarkLive.cpp is doing.

Peter

On Thu, Feb 28, 2019 at 5:15 PM Rui Ueyama via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> I hacked up a patch to make lld output a dependency graph in the graphviz
> "dot" format.
>
> https://gist.github.com/rui314/4eab9f328a5568b682d11c84d328cdaa -- this
> is a patch, which is just visiting all input sections and relocations. Note
> that this is far from completion but just a proof-of-concept.
>
> https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc is a
> result of static-linking a "hello world" program.
>
> https://rui314.github.io/hello.svg  -- I rendered the above dot file with
> graphviz `sfdp` engine. The rendered graph is too large and very hard to
> read. Apparently, I need a better visualization tool.
>
> On Wed, Feb 27, 2019 at 7:56 PM Zachary Turner <zturner at google.com> wrote:
>
>> +1 for graphviz dot format, so that it can be consumed by any one of many
>> existing graph visualization tools.
>>
>> On Wed, Feb 27, 2019 at 7:29 PM Shi, Steven via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> >To summarise, I think we may
>>> > be able to do quite well with some very simple extra analysis in LLD,
>>> > a machine readable dependency graph would also be very useful for the
>>> > more complex cases.
>>>
>>> Strongly agree. The linker based dependency graph would be very useful
>>> for Uefi firmware. Below are my usage examples:
>>> 1. I need to detect the redundant code in my firmware, and I once wrote
>>> a analysis tool to compare the IR level symbols and call graph info before
>>> any optimization and after full optimization (e.g. LTO). But the IR level
>>> info does not support assembly code info well. So, there are many
>>> dependency information missing and false positive in my analysis tool. It
>>> will be more sound if the linker can help output complete and accurate
>>> dependency graph for final executable.
>>> 2. I need a tool to analyze and  track the firmware module accurate
>>> dependency for build cache soundness. Build performance is now a pain point
>>> in our CI system because every patch need to verify on many build targets
>>> in our side. We hope to enable the build cache (both module level and file
>>> level) to accelerate the build time. For module level build cache enabling,
>>> a very important problem is how to know the module's accurate dependency
>>> efficiently. I'm looking forward to the linker based dependency graph
>>> feature.
>>>
>>>
>>> Thanks
>>> Steven
>>> > -----Original Message-----
>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
>>> Peter
>>> > Smith via llvm-dev
>>> > Sent: Wednesday, February 27, 2019 6:37 PM
>>> > To: Michael Spencer <bigcheesegs at gmail.com>
>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org>
>>> > Subject: Re: [llvm-dev] Linker option to dump dependency graph
>>> >
>>> > Hello,
>>> >
>>> > I think outputting a dependency graph is a good idea and would enable
>>> > some offline analysis. I think that there is some advantage to
>>> > building some of the simpler ones in, particularly those that would
>>> > need heavy annotations to the dependency graph, in particular unless
>>> > we write a sample analysis tool that ships with the release, many
>>> > users are going to miss out on useful features as they aren't going to
>>> > have the time to build one. I've put some comments inline:
>>> >
>>> > On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev
>>> > <llvm-dev at lists.llvm.org> wrote:
>>> > >
>>> > > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at google.com> wrote:
>>> > >>
>>> > >> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer
>>> > <bigcheesegs at gmail.com> wrote:
>>> > >>>
>>> > >>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev <llvm-
>>> > dev at lists.llvm.org> wrote:
>>> > >>>>
>>> > >>>> Hi,
>>> > >>>>
>>> > >>>> I've heard people say that they want to analyze dependencies
>>> between
>>> > object files at the linker level so that they can run a whole-program
>>> analysis
>>> > which cannot be done at the compiler that works for one compilation
>>> unit at
>>> > a time. I'd like to start a discussion as to what we can do with it
>>> and how to
>>> > make it possible. I'm also sharing my idea about how to make it
>>> possible.
>>> > >>>>
>>> > >>>> Dependency analyses
>>> > >>>> First, let me start with a few examples of analyses I'm heard of
>>> or
>>> > thinking about. Dependencies between object files can be represented
>>> as a
>>> > graph where vertices are input sections and edges are symbols and
>>> > relocations. Analyses would work on the dependency graph. Examples of
>>> > analyses include but not limited to the following:
>>> > >>>>
>>> > >>>>  - Figure out why some library or an object file gets linked.
>>> > >>>>
>>> >
>>> > Arm's proprietary linker has a very helpful feature in verbose mode
>>> > where it will report on object loading: global/weak definitions and
>>> > global/weak references. For libraries you'd get a message like
>>> > selecting member.o from library.a to define symbol S. This resulted in
>>> > quite an effective trace of the linker output that could answer most
>>> > "why did this library and object file get loaded question?" One thing
>>> > a dependency graph might not capture is the order in which events
>>> > occur, this can be very useful when debugging problems caused by
>>> > library selection order.
>>> >
>>> > >>>>  - Finding a candidate to eliminate dependency by finding a
>>> "weak" link
>>> > to a library. We can for example say the dependency to a library is
>>> weak if
>>> > the library in the graph can be unreachable if we remove N edges from
>>> the
>>> > graph (which is likely to correspond to removing N function calls from
>>> the
>>> > code), where N is a small number.
>>> > >>>>
>>> > >>>>  - Understanding which of new dependencies increase the executable
>>> > size the most, compare to a previous build.
>>> > >>>>
>>> >
>>> > Arm's linker, being focused on embedded systems has a useful feature
>>> > that summarises the amount of content taken from each object broken
>>> > down into code, ro-data, rw-date etc. This can be helpful in the face
>>> > of comdat group elimination and optimisations such as garbage
>>> > collection and ICF that can be difficult to predict from a dependency
>>> > graph. It is true that this information could be added as attributes
>>> > but again it may just be easier to write a simple analysis pass over
>>> > the output in the linker.
>>> >
>>> > >>>>  - Finding bad or circular dependencies between sub-components.
>>> > >>>>
>>> > >>>> There would be many more analyses you want to run at the linker
>>> input
>>> > level. Currently, lld doesn't actively support such analyses. There
>>> are a few
>>> > options to make the linker emit dependency information (e.g. --cref or
>>> -Map),
>>> > but the output of the options is not comprehensive; you cannot
>>> reconstruct a
>>> > dependency graph from the output of the options.
>>> >
>>> >
>>> >
>>> > >>>>
>>> > >>>> Dumping dependency graph
>>> > >>>> So, I'm thinking if it would be desirable to add a new feature to
>>> the
>>> > linker to dump an entire dependency graph in such a way that a graph
>>> can be
>>> > reconstructed by reading it back. Once we have such feature, we can
>>> link a
>>> > program with the feature enabled and run any kind of dependency
>>> analysis
>>> > on the output. You can save dumps to compare to previous builds. You
>>> can
>>> > run any number of analyses on a dump, instead of invoking the linker
>>> for
>>> > each analysis.
>>> > >>>>
>>> > >>>> I don't have a concrete idea about the file output format, but I
>>> believe
>>> > it is essentially enough to emit triplets of (<from input section>,
>>> <symbol>,
>>> > <to input section>), which represents an edge, to reconstruct a graph.
>>> > >>>>
>>> > >>>> Thoughts?
>>> > >>>
>>> > >>>
>>> > >>> Back when I worked on the linker I pretty much always had a way to
>>> > dump a graphviz dot file to look at things.  Pretty much every graph
>>> > library/tool can read dot files, and they are easy to hack up a parser
>>> for.  You
>>> > can also add attributes to nodes and edges to store arbitrary data.
>>> > >>
>>> > >>
>>> > >> That's an interesting idea.
>>> > >>
>>> > >>> As for what to put it in, it really depends on how detailed it
>>> needs to be.
>>> > Should symbols and sections be collapsed together?  Should it include
>>> > relocation types? Symbol types/binding/size/etc?
>>> > >>
>>> > >>
>>> > >>  Maybe everything? We can for example emit all symbols and input
>>> > sections first, and then emit a graph as the second half of the
>>> output. E.g.
>>> > >>
>>> > >> Symbols:
>>> > >>   <list of symbols>
>>> > >> Sections:
>>> > >>   <list of sections>
>>> > >> Graph:
>>> > >>  1 2 3  // 1st section depends on 3rd section via 2nd symbol
>>> > >>  5 1 4  // likewise
>>> > >
>>> > >
>>> > > I suppose it's a question of if we want users to need to also read
>>> the inputs
>>> > if they want things like section size and other section/symbol
>>> attributes.  It
>>> > would be pretty trivial to include that data as long as we have a
>>> > format/syntax for it.
>>> > >
>>> > > dot supports listing nodes first with attributes and then referring
>>> to them by
>>> > name later when listing edges.
>>> > >
>>> > > - Michael Spencer
>>> > >
>>> >
>>> > I've experimented with dot files for this type of thing in the past.
>>> > The difficulty is that they get too large to be realistically viewed
>>> > very quickly. At that point you need to write scripts to process the
>>> > output and in that case you may as well use JSON or XML, which I guess
>>> > could easily be processed into dot files. To summarise, I think we may
>>> > be able to do quite well with some very simple extra analysis in LLD,
>>> > a machine readable dependency graph would also be very useful for the
>>> > more complex cases.
>>> >
>>> > Peter
>>> >
>>> >  _______________________________________________
>>> > > LLVM Developers mailing list
>>> > > llvm-dev at lists.llvm.org
>>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>> > _______________________________________________
>>> > LLVM Developers mailing list
>>> > llvm-dev at lists.llvm.org
>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>


-- 
-- 
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190228/36fb5e33/attachment.html>


More information about the llvm-dev mailing list