[llvm-dev] Linker option to dump dependency graph

Fri Jun 21 08:15:04 PDT 2019

You may use -y foo (--trace-symbol=foo) to trace a symbol. -t (--trace) and -M (--print-map) are also useful.

On Fri, Jun 21, 2019 at 10:08 PM Andrew Grieve <agrieve at chromium.org> wrote:

> I didn't pay much attention when "replying all". I did actually mean to
> ask you :).
>
> It's coming up repeatedly in Chrome that I want to be able to find the
> reason why a symbol is included, so even if there's a patch I could pull in
> myself to answer these queries, that would be appreciated :).
>
> On Fri, Jun 21, 2019 at 8:10 AM Rui Ueyama <ruiu at google.com> wrote:
>
>> Sorry, I didn't notice that you were asking Fangrui, not me.
>> Please disregard my previous email.
>>
>> On Fri, Jun 21, 2019 at 9:08 PM Rui Ueyama <ruiu at google.com> wrote:
>>
>>> No I didn't.
>>>
>>> On Fri, Jun 21, 2019 at 10:52 AM Andrew Grieve via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Just wanted to check in on this - did your patches make it past the
>>>> prototype phase?
>>>>
>>>> On Tue, Mar 5, 2019 at 2:41 AM Fāng-ruì Sòng via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> > One thing a dependency graph might not capture is the order in which
>>>>> events occur, this can be very useful when debugging problems caused by
>>>>> library selection order.
>>>>>
>>>>> The event stream sounds like a more fine-grained --trace (-t).
>>>>>
>>>>> > (<from input section>, <symbol>, <to input section>)
>>>>>
>>>>> In --no-gc-sections mode and in some analyses, the file name part of
>>>>> the input section should be good enough.
>>>>>
>>>>> > section size and other section/symbol attributes
>>>>>
>>>>> If such customization is favored and the complexity isn't a big issue,
>>>>> it can probably be implemented as format specifiers (I'm thinking of
>>>>> printf, ps -o, date, ...). The design of
>>>>> https://github.com/Juniper/libxo can be used for reference.
>>>>>
>>>>> We shall flesh out the possible vertex/edge types and additional
>>>>> information that users may expect.
>>>>>
>>>>> On Fri, Mar 1, 2019 at 1:18 PM Peter Collingbourne via llvm-dev
>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>> >
>>>>> > You might have realized this already but it's probably not a good
>>>>> idea to use InputSection::Relocations for this because that ends up missing
>>>>> anything that becomes a dynamic relocation. I reckon that the code should
>>>>> be doing exactly what MarkLive.cpp is doing.
>>>>> >
>>>>> > Peter
>>>>> >
>>>>> > On Thu, Feb 28, 2019 at 5:15 PM Rui Ueyama via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>> >>
>>>>> >> I hacked up a patch to make lld output a dependency graph in the
>>>>> graphviz "dot" format.
>>>>> >>
>>>>> >> https://gist.github.com/rui314/4eab9f328a5568b682d11c84d328cdaa --
>>>>> this is a patch, which just visits all input sections and relocations.
>>>>> Note that this is far from complete; it is just a proof of concept.
>>>>> >>
>>>>> >> https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc is
>>>>> a result of static-linking a "hello world" program.
>>>>> >>
>>>>> >> https://rui314.github.io/hello.svg  -- I rendered the above dot
>>>>> file with graphviz `sfdp` engine. The rendered graph is too large and very
>>>>> hard to read. Apparently, I need a better visualization tool.
>>>>> >>
>>>>> >> On Wed, Feb 27, 2019 at 7:56 PM Zachary Turner <zturner at google.com>
>>>>> wrote:
>>>>> >>>
>>>>> >>> +1 for graphviz dot format, so that it can be consumed by any one
>>>>> of many existing graph visualization tools.
>>>>> >>>
>>>>> >>> On Wed, Feb 27, 2019 at 7:29 PM Shi, Steven via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>> >>>>
>>>>> >>>> >To summarise, I think we may
>>>>> >>>> > be able to do quite well with some very simple extra analysis
>>>>> in LLD,
>>>>> >>>> > a machine readable dependency graph would also be very useful
>>>>> for the
>>>>> >>>> > more complex cases.
>>>>> >>>>
>>>>> >>>> Strongly agree. The linker based dependency graph would be very
>>>>> useful for UEFI firmware. Below are my usage examples:
>>>>> >>>> 1. I need to detect the redundant code in my firmware, and I once
>>>>> wrote an analysis tool to compare the IR level symbols and call graph info
>>>>> before any optimization and after full optimization (e.g. LTO). But the IR
>>>>> level info does not support assembly code info well. So, a lot of
>>>>> dependency information is missing and my analysis tool reports false
>>>>> positives. It would be more sound if the linker could output a complete and
>>>>> accurate dependency graph for the final executable.
>>>>> >>>> 2. I need a tool to analyze and track the accurate dependencies of
>>>>> firmware modules for build cache soundness. Build performance is now a
>>>>> pain point in our CI system because every patch needs to be verified on many
>>>>> build targets on our side. We hope to enable the build cache (both module
>>>>> level and file level) to reduce the build time. For module level build
>>>>> cache enabling, a very important problem is how to know the module's
>>>>> accurate dependencies efficiently. I'm looking forward to the linker based
>>>>> dependency graph feature.
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> Thanks
>>>>> >>>> Steven
>>>>> >>>> > -----Original Message-----
>>>>> >>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On
>>>>> Behalf Of Peter
>>>>> >>>> > Smith via llvm-dev
>>>>> >>>> > Sent: Wednesday, February 27, 2019 6:37 PM
>>>>> >>>> > To: Michael Spencer <bigcheesegs at gmail.com>
>>>>> >>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org>
>>>>> >>>> > Subject: Re: [llvm-dev] Linker option to dump dependency graph
>>>>> >>>> >
>>>>> >>>> > Hello,
>>>>> >>>> >
>>>>> >>>> > I think outputting a dependency graph is a good idea and would
>>>>> enable
>>>>> >>>> > some offline analysis. I think that there is some advantage to
>>>>> >>>> > building some of the simpler ones in, particularly those that
>>>>> would
>>>>> >>>> > need heavy annotations to the dependency graph, in particular
>>>>> unless
>>>>> >>>> > we write a sample analysis tool that ships with the release,
>>>>> many
>>>>> >>>> > users are going to miss out on useful features as they aren't
>>>>> going to
>>>>> >>>> > have the time to build one. I've put some comments inline:
>>>>> >>>> >
>>>>> >>>> > On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev
>>>>> >>>> > <llvm-dev at lists.llvm.org> wrote:
>>>>> >>>> > >
>>>>> >>>> > > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at google.com>
>>>>> wrote:
>>>>> >>>> > >>
>>>>> >>>> > >> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer
>>>>> >>>> > <bigcheesegs at gmail.com> wrote:
>>>>> >>>> > >>>
>>>>> >>>> > >>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev
>>>>> <llvm-
>>>>> >>>> > dev at lists.llvm.org> wrote:
>>>>> >>>> > >>>>
>>>>> >>>> > >>>> Hi,
>>>>> >>>> > >>>>
>>>>> >>>> > >>>> I've heard people say that they want to analyze
>>>>> dependencies between
>>>>> >>>> > object files at the linker level so that they can run a
>>>>> whole-program analysis
>>>>> >>>> > which cannot be done at the compiler that works for one
>>>>> compilation unit at
>>>>> >>>> > a time. I'd like to start a discussion as to what we can do
>>>>> with it and how to
>>>>> >>>> > make it possible. I'm also sharing my idea about how to make it
>>>>> possible.
>>>>> >>>> > >>>>
>>>>> >>>> > >>>> Dependency analyses
>>>>> >>>> > >>>> First, let me start with a few examples of analyses I've
>>>>> heard of or
>>>>> >>>> > am thinking about. Dependencies between object files can be
>>>>> represented as a
>>>>> >>>> > graph where vertices are input sections and edges are symbols
>>>>> and
>>>>> >>>> > relocations. Analyses would work on the dependency graph.
>>>>> Examples of
>>>>> >>>> > analyses include, but are not limited to, the following:
>>>>> >>>> > >>>>
>>>>> >>>> > >>>>  - Figure out why some library or an object file gets
>>>>> linked.
>>>>> >>>> > >>>>
>>>>> >>>> >
>>>>> >>>> > Arm's proprietary linker has a very helpful feature in verbose
>>>>> mode
>>>>> >>>> > where it will report on object loading: global/weak definitions
>>>>> and
>>>>> >>>> > global/weak references. For libraries you'd get a message like
>>>>> >>>> > selecting member.o from library.a to define symbol S. This
>>>>> resulted in
>>>>> >>>> > quite an effective trace of the linker output that could answer
>>>>> most
>>>>> >>>> > "why did this library and object file get loaded?" questions. One
>>>>> thing
>>>>> >>>> > a dependency graph might not capture is the order in which
>>>>> events
>>>>> >>>> > occur, this can be very useful when debugging problems caused by
>>>>> >>>> > library selection order.
>>>>> >>>> >
>>>>> >>>> > >>>>  - Finding a candidate to eliminate dependency by finding
>>>>> a "weak" link
>>>>> >>>> > to a library. We can for example say the dependency to a
>>>>> library is weak if
>>>>> >>>> > the library in the graph can be unreachable if we remove N
>>>>> edges from the
>>>>> >>>> > graph (which is likely to correspond to removing N function
>>>>> calls from the
>>>>> >>>> > code), where N is a small number.
>>>>> >>>> > >>>>
>>>>> >>>> > >>>>  - Understanding which of the new dependencies increase the
>>>>> executable
>>>>> >>>> > size the most, compared to a previous build.
>>>>> >>>> > >>>>
>>>>> >>>> >
>>>>> >>>> > Arm's linker, being focused on embedded systems, has a useful
>>>>> feature
>>>>> >>>> > that summarises the amount of content taken from each object
>>>>> broken
>>>>> >>>> > down into code, ro-data, rw-data etc. This can be helpful in
>>>>> the face
>>>>> >>>> > of comdat group elimination and optimisations such as garbage
>>>>> >>>> > collection and ICF that can be difficult to predict from a
>>>>> dependency
>>>>> >>>> > graph. It is true that this information could be added as
>>>>> attributes
>>>>> >>>> > but again it may just be easier to write a simple analysis pass
>>>>> over
>>>>> >>>> > the output in the linker.
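
A per-object size summary of this kind could be sketched roughly as below. This is only an illustration: the (object, kind, size) records and the file names are made up, and nothing here reflects the actual output of Arm's linker or lld.

```python
from collections import defaultdict


def summarise_sizes(sections):
    """Sum section sizes per object file, broken down by kind.

    `sections` is an iterable of hypothetical (object, kind, size) records,
    where kind is something like "code", "ro-data" or "rw-data".
    """
    totals = defaultdict(lambda: defaultdict(int))
    for obj, kind, size in sections:
        totals[obj][kind] += size
    # Convert to plain dicts so the result is easy to print or compare.
    return {obj: dict(kinds) for obj, kinds in totals.items()}


# Made-up records for illustration only.
RECORDS = [
    ("main.o", "code", 120),
    ("main.o", "ro-data", 16),
    ("foo.o", "code", 300),
    ("foo.o", "code", 44),
    ("foo.o", "rw-data", 8),
]

for obj, kinds in summarise_sizes(RECORDS).items():
    print(obj, kinds)
```

Because the sizes would be taken after garbage collection and ICF, a table like this could answer size questions without any graph traversal.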
>>>>> >>>> >
>>>>> >>>> > >>>>  - Finding bad or circular dependencies between
>>>>> sub-components.
>>>>> >>>> > >>>>
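
As an illustration of the last item, a circular dependency check over a dumped graph might look like the sketch below. It assumes the edges have already been collapsed to (from sub-component, to sub-component) pairs; the component names are hypothetical.

```python
def find_cycle(edges):
    """Return one dependency cycle as a list of nodes (first == last),
    or None if the graph is acyclic. `edges` is an iterable of (src, dst)."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}

    for start in graph:
        if color[start] != WHITE:
            continue
        # Iterative depth-first search; `path` is the chain of grey nodes.
        path = [start]
        stack = [iter(graph[start])]
        color[start] = GREY
        while stack:
            for nxt in stack[-1]:
                if color[nxt] == GREY:
                    # Back edge: the cycle is the tail of the current path.
                    return path[path.index(nxt):] + [nxt]
                if color[nxt] == WHITE:
                    color[nxt] = GREY
                    path.append(nxt)
                    stack.append(iter(graph[nxt]))
                    break
            else:
                # All successors handled; retire this node.
                color[path.pop()] = BLACK
                stack.pop()
    return None


# Hypothetical sub-component edges for illustration.
COMPONENT_EDGES = [("ui", "net"), ("net", "util"), ("util", "ui"), ("app", "ui")]
print(find_cycle(COMPONENT_EDGES))
```

This reports only one cycle; listing every cycle, or ranking "bad" dependencies against a layering policy, would need more than this sketch.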
>>>>> >>>> > >>>> There would be many more analyses you want to run at the
>>>>> linker input
>>>>> >>>> > level. Currently, lld doesn't actively support such analyses.
>>>>> There are a few
>>>>> >>>> > options to make the linker emit dependency information (e.g.
>>>>> --cref or -Map),
>>>>> >>>> > but the output of the options is not comprehensive; you cannot
>>>>> reconstruct a
>>>>> >>>> > dependency graph from the output of the options.
>>>>> >>>> >
>>>>> >>>> >
>>>>> >>>> >
>>>>> >>>> > >>>>
>>>>> >>>> > >>>> Dumping dependency graph
>>>>> >>>> > >>>> So, I'm wondering whether it would be desirable to add a new
>>>>> feature to the
>>>>> >>>> > linker to dump an entire dependency graph in such a way that a
>>>>> graph can be
>>>>> >>>> > reconstructed by reading it back. Once we have such a feature, we
>>>>> can link a
>>>>> >>>> > program with the feature enabled and run any kind of dependency
>>>>> analysis
>>>>> >>>> > on the output. You can save dumps to compare to previous
>>>>> builds. You can
>>>>> >>>> > run any number of analyses on a dump, instead of invoking the
>>>>> linker for
>>>>> >>>> > each analysis.
>>>>> >>>> > >>>>
>>>>> >>>> > >>>> I don't have a concrete idea about the file output format,
>>>>> but I believe
>>>>> >>>> > it is essentially enough to emit triplets of (<from input
>>>>> section>, <symbol>,
>>>>> >>>> > <to input section>), which represents an edge, to reconstruct a
>>>>> graph.
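
To illustrate how such triplets could answer "why is this section included?", here is a minimal sketch. The triplets, section names and symbol names are made up; no such dump format exists in lld as far as this thread shows.

```python
from collections import deque

# Hypothetical edge triplets: (from input section, symbol, to input section).
EDGES = [
    ("main.o:(.text.main)", "foo", "foo.o:(.text.foo)"),
    ("foo.o:(.text.foo)", "bar", "libbar.a(bar.o):(.text.bar)"),
    ("main.o:(.text.main)", "puts", "libc.a(puts.o):(.text.puts)"),
]


def build_graph(edges):
    """Map each source section to a list of (symbol, target section)."""
    graph = {}
    for src, symbol, dst in edges:
        graph.setdefault(src, []).append((symbol, dst))
    return graph


def why_included(graph, root, target):
    """Breadth-first search from `root`. Returns the shortest chain of
    (from, symbol, to) edges reaching `target`, or None if unreachable."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for symbol, nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = (node, symbol)
                queue.append(nxt)
    if target not in parent:
        return None
    # Walk the parent links back from the target to the root.
    path = []
    node = target
    while parent[node] is not None:
        prev, symbol = parent[node]
        path.append((prev, symbol, node))
        node = prev
    path.reverse()
    return path


graph = build_graph(EDGES)
for src, symbol, dst in why_included(
    graph, "main.o:(.text.main)", "libbar.a(bar.o):(.text.bar)"
):
    print(f"{src} --{symbol}--> {dst}")
```

A real tool would start from all GC roots (entry point, exported symbols and so on) rather than a single root section.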
>>>>> >>>> > >>>>
>>>>> >>>> > >>>> Thoughts?
>>>>> >>>> > >>>
>>>>> >>>> > >>>
>>>>> >>>> > >>> Back when I worked on the linker I pretty much always had a
>>>>> way to
>>>>> >>>> > dump a graphviz dot file to look at things.  Pretty much every
>>>>> graph
>>>>> >>>> > library/tool can read dot files, and they are easy to hack up a
>>>>> parser for.  You
>>>>> >>>> > can also add attributes to nodes and edges to store arbitrary
>>>>> data.
>>>>> >>>> > >>
>>>>> >>>> > >>
>>>>> >>>> > >> That's an interesting idea.
>>>>> >>>> > >>
>>>>> >>>> > >>> As for what to put it in, it really depends on how detailed
>>>>> it needs to be.
>>>>> >>>> > Should symbols and sections be collapsed together?  Should it
>>>>> include
>>>>> >>>> > relocation types? Symbol types/binding/size/etc?
>>>>> >>>> > >>
>>>>> >>>> > >>
>>>>> >>>> > >>  Maybe everything? We can for example emit all symbols and
>>>>> input
>>>>> >>>> > sections first, and then emit a graph as the second half of the
>>>>> output. E.g.
>>>>> >>>> > >>
>>>>> >>>> > >> Symbols:
>>>>> >>>> > >>   <list of symbols>
>>>>> >>>> > >> Sections:
>>>>> >>>> > >>   <list of sections>
>>>>> >>>> > >> Graph:
>>>>> >>>> > >>  1 2 3  // 1st section depends on 3rd section via 2nd symbol
>>>>> >>>> > >>  5 1 4  // likewise
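
A reader for a dump laid out like the example above might be sketched as follows. The details are assumptions, since the thread leaves them open: one name per line under "Symbols:" and "Sections:", 1-based indexes in "Graph:" rows ordered as (section, symbol, section), and "//" starting a comment.

```python
def parse_dump(text):
    """Parse a 'Symbols: / Sections: / Graph:' dump into a list of
    (from section, symbol, to section) edges with names resolved."""
    tables = {"Symbols": [], "Sections": [], "Graph": []}
    current = None
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        if line.endswith(":") and line[:-1] in tables:
            current = line[:-1]
            continue
        if current is None:
            raise ValueError(f"entry before any header: {line!r}")
        tables[current].append(line)

    symbols, sections = tables["Symbols"], tables["Sections"]
    edges = []
    for row in tables["Graph"]:
        src, sym, dst = (int(field) for field in row.split())
        # Assumed 1-based, as in "1st section ... via 2nd symbol".
        edges.append((sections[src - 1], symbols[sym - 1], sections[dst - 1]))
    return edges


# Made-up dump for illustration.
SAMPLE = """\
Symbols:
  main
  foo_table
Sections:
  main.o:(.text)
  foo.o:(.text)
  foo.o:(.rodata)
Graph:
  1 2 3  // 1st section depends on 3rd section via 2nd symbol
"""

print(parse_dump(SAMPLE))
```

Index-based rows keep the dump compact, at the cost of needing both tables before any edge can be interpreted.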
>>>>> >>>> > >
>>>>> >>>> > >
>>>>> >>>> > > I suppose it's a question of if we want users to need to also
>>>>> read the inputs
>>>>> >>>> > if they want things like section size and other section/symbol
>>>>> attributes.  It
>>>>> >>>> > would be pretty trivial to include that data as long as we have
>>>>> a
>>>>> >>>> > format/syntax for it.
>>>>> >>>> > >
>>>>> >>>> > > dot supports listing nodes first with attributes and then
>>>>> referring to them by
>>>>> >>>> > name later when listing edges.
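
The "nodes first with attributes, then edges by name" layout could be produced by something like the sketch below. The attribute name `size` and the section names are illustrative assumptions, and the quoting helper only handles double quotes, so names containing backslashes are out of scope here.

```python
def quote(text):
    """Quote a string as a DOT identifier (escapes double quotes only)."""
    return '"' + text.replace('"', '\\"') + '"'


def to_dot(nodes, edges):
    """Render a graph as DOT text.

    nodes: {section name: {attribute: value}}
    edges: iterable of (from section, symbol, to section)
    """
    lines = ["digraph deps {"]
    # Declare every node once, carrying its attributes.
    for name, attrs in nodes.items():
        attr_text = ", ".join(
            f"{key}={quote(str(value))}" for key, value in attrs.items()
        )
        lines.append(f"  {quote(name)} [{attr_text}];")
    # Then refer to nodes by name; the symbol becomes the edge label.
    for src, symbol, dst in edges:
        lines.append(f"  {quote(src)} -> {quote(dst)} [label={quote(symbol)}];")
    lines.append("}")
    return "\n".join(lines)


# Made-up nodes and edges for illustration.
NODES = {
    "main.o:(.text)": {"size": 120},
    "foo.o:(.text)": {"size": 344},
}
EDGES = [("main.o:(.text)", "foo", "foo.o:(.text)")]

print(to_dot(NODES, EDGES))
```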
>>>>> >>>> > >
>>>>> >>>> > > - Michael Spencer
>>>>> >>>> > >
>>>>> >>>> >
>>>>> >>>> > I've experimented with dot files for this type of thing in the
>>>>> past.
>>>>> >>>> > The difficulty is that they very quickly get too large to be
>>>>> realistically
>>>>> >>>> > viewed. At that point you need to write scripts to
>>>>> process the
>>>>> >>>> > output and in that case you may as well use JSON or XML, which
>>>>> I guess
>>>>> >>>> > could easily be processed into dot files. To summarise, I think
>>>>> we may
>>>>> >>>> > be able to do quite well with some very simple extra analysis
>>>>> in LLD,
>>>>> >>>> > a machine readable dependency graph would also be very useful
>>>>> for the
>>>>> >>>> > more complex cases.
>>>>> >>>> >
>>>>> >>>> > Peter
>>>>> >>>> >
>>>>> >>>> >  _______________________________________________
>>>>> >>>> > > LLVM Developers mailing list
>>>>> >>>> > > llvm-dev at lists.llvm.org
>>>>> >>>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>> >>
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > --
>>>>> > Peter
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> 宋方睿
>>>>>
>>>>
>>>

-- 
宋方睿