[llvm-dev] Linker option to dump dependency graph

Thu Oct 22 18:10:40 PDT 2020

I'm still very much interested in this feature. Wondering if an llvm bug
was filed to track it? If not, would it be useful for me to file one?

On Fri, Jun 21, 2019 at 11:15 AM Fāng-ruì Sòng <maskray at google.com> wrote:

> You may use -y foo to trace a symbol. -t and -M are also useful.
>
> On Fri, Jun 21, 2019 at 10:08 PM Andrew Grieve <agrieve at chromium.org>
> wrote:
>
>> I didn't pay much attention when "replying all". I did actually mean to
>> ask you :).
>>
>> It's coming up repeatedly in Chrome that I want to be able to find the
>> reason why a symbol is included, so even if there's a patch I could pull in
>> myself to answer these queries, that would be appreciated :).
>>
>> On Fri, Jun 21, 2019 at 8:10 AM Rui Ueyama <ruiu at google.com> wrote:
>>
>>> Sorry, I didn't notice that you were asking Fangrui, not me.
>>> Please disregard my previous email.
>>>
>>> On Fri, Jun 21, 2019 at 9:08 PM Rui Ueyama <ruiu at google.com> wrote:
>>>
>>>> No I didn't.
>>>>
>>>> On Fri, Jun 21, 2019 at 10:52 AM Andrew Grieve via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Just wanted to check in on this - did your patches make it past the
>>>>> prototype phase?
>>>>>
>>>>> On Tue, Mar 5, 2019 at 2:41 AM Fāng-ruì Sòng via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> > One thing a dependency graph might not capture is the order in
>>>>>> which events occur, this can be very useful when debugging problems caused
>>>>>> by library selection order.
>>>>>>
>>>>>> The event stream sounds like a more fine-grained --trace (-t).
>>>>>>
>>>>>> > (<from input section>, <symbol>, <to input section>)
>>>>>>
>>>>>> In --no-gc-sections mode and in some analyses, the file name part of
>>>>>> the input section should be good enough.
>>>>>>
>>>>>> > section size and other section/symbol attributes
>>>>>>
>>>>>> If such customization is favored and the complexity isn't a big issue,
>>>>>> it can probably be implemented as format specifiers (I'm thinking of
>>>>>> printf, ps -o, date, ...). The design of
>>>>>> https://github.com/Juniper/libxo can be used for reference.
>>>>>>
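One way to picture the format-specifier idea is below. This is only a sketch: the field names (`from_file`, `to_size`, and so on) are hypothetical, and Python's `str.format` stands in for whatever specifier syntax the linker might adopt.

```python
def format_edges(edges, spec):
    """Render each edge record with a user-supplied format string."""
    return [spec.format(**edge) for edge in edges]

# Hypothetical edge records; none of these field names come from lld.
edges = [
    {"from_file": "main.o", "from_section": ".text", "symbol": "puts",
     "to_file": "libc.a(puts.o)", "to_section": ".text", "to_size": 412},
]

# File-level view, which may be enough in --no-gc-sections mode.
file_level = format_edges(edges, "{from_file} -> {to_file}")

# More detailed view that also prints a section attribute.
detailed = format_edges(
    edges,
    "{from_file}:{from_section} --{symbol}--> "
    "{to_file}:{to_section} ({to_size} bytes)",
)

for line in file_level + detailed:
    print(line)
```

The same edge data serves both views, so the level of detail becomes a choice made on the command line rather than in the dump format.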
>>>>>> We shall flesh out the possible vertex/edge types and additional
>>>>>> information that users may expect.
>>>>>>
>>>>>> On Fri, Mar 1, 2019 at 1:18 PM Peter Collingbourne via llvm-dev
>>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>> >
>>>>>> > You might have realized this already but it's probably not a good
>>>>>> idea to use InputSection::Relocations for this because that ends up missing
>>>>>> anything that becomes a dynamic relocation. I reckon that the code should
>>>>>> be doing exactly what MarkLive.cpp is doing.
>>>>>> >
>>>>>> > Peter
>>>>>> >
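To illustrate the point about collecting edges during the liveness traversal, here is a minimal sketch of a generic worklist marking pass that records every edge it follows. It is not lld's MarkLive.cpp; the function name, the `refs` structure, and the sample section names are all assumptions made for illustration.

```python
from collections import deque

def mark_live_and_dump(roots, refs):
    """Mark sections reachable from roots and record each edge followed.

    refs maps a section to a list of (symbol, target_section) pairs and is
    assumed to contain every reference the marker would follow, including
    ones that later turn into dynamic relocations.
    """
    roots = list(dict.fromkeys(roots))      # drop duplicates, keep order
    live = set(roots)
    work = deque(roots)
    dumped = []
    while work:
        sec = work.popleft()
        for sym, dst in refs.get(sec, []):
            dumped.append((sec, sym, dst))  # edge seen by the marker
            if dst not in live:
                live.add(dst)
                work.append(dst)
    return live, dumped

# Hypothetical input; "libc.so:environ" stands in for a target that would
# only be referenced through a dynamic relocation in the output.
refs = {
    "main.o:.text": [("printf", "libc.a(printf.o):.text")],
    "libc.a(printf.o):.text": [("vfprintf", "libc.a(vfprintf.o):.text")],
    "main.o:.data": [("environ", "libc.so:environ")],
    "unused.o:.text": [("helper", "util.o:.text")],
}
live, dumped = mark_live_and_dump(["main.o:.text", "main.o:.data"], refs)
for edge in dumped:
    print(edge)
```

Because the edges are emitted by the same traversal that decides liveness, the dump cannot disagree with what the marker actually kept.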
>>>>>> > On Thu, Feb 28, 2019 at 5:15 PM Rui Ueyama via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>> >>
>>>>>> >> I hacked up a patch to make lld output a dependency graph in the
>>>>>> >> graphviz "dot" format.
>>>>>> >>
>>>>>> >> https://gist.github.com/rui314/4eab9f328a5568b682d11c84d328cdaa
>>>>>> >> -- this is the patch, which just visits all input sections and
>>>>>> >> relocations. Note that it is far from complete; it is just a
>>>>>> >> proof of concept.
>>>>>> >>
>>>>>> >> https://gist.github.com/rui314/5e85c559835ecddad46dcf02fe3ffafc
>>>>>> >> is the result of statically linking a "hello world" program.
>>>>>> >>
>>>>>> >> https://rui314.github.io/hello.svg -- I rendered the above dot
>>>>>> >> file with the graphviz `sfdp` engine. The rendered graph is too large
>>>>>> >> and very hard to read. Apparently, I need a better visualization tool.
>>>>>> >>
>>>>>> >> On Wed, Feb 27, 2019 at 7:56 PM Zachary Turner <zturner at google.com>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>> +1 for graphviz dot format, so that it can be consumed by any one
>>>>>> of many existing graph visualization tools.
>>>>>> >>>
>>>>>> >>> On Wed, Feb 27, 2019 at 7:29 PM Shi, Steven via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>> >>>>
>>>>>> >>>> >To summarise, I think we may
>>>>>> >>>> > be able to do quite well with some very simple extra analysis
>>>>>> in LLD,
>>>>>> >>>> > a machine readable dependency graph would also be very useful
>>>>>> for the
>>>>>> >>>> > more complex cases.
>>>>>> >>>>
>>>>>> >>>> Strongly agree. A linker-based dependency graph would be very
>>>>>> >>>> useful for UEFI firmware. Below are my usage examples:
>>>>>> >>>> 1. I need to detect redundant code in my firmware. I once wrote
>>>>>> >>>> an analysis tool that compares the IR-level symbols and call-graph
>>>>>> >>>> info before any optimization and after full optimization (e.g. LTO).
>>>>>> >>>> But the IR-level info does not cover assembly code well, so my tool
>>>>>> >>>> misses a lot of dependency information and reports false positives.
>>>>>> >>>> It would be more sound if the linker could output a complete and
>>>>>> >>>> accurate dependency graph for the final executable.
>>>>>> >>>> 2. I need a tool to analyze and track the accurate dependencies of
>>>>>> >>>> firmware modules for build cache soundness. Build performance is now
>>>>>> >>>> a pain point in our CI system because every patch needs to be verified
>>>>>> >>>> on many build targets on our side. We hope to enable the build cache
>>>>>> >>>> (at both module level and file level) to reduce build time. For
>>>>>> >>>> module-level build caching, a very important problem is how to
>>>>>> >>>> determine a module's accurate dependencies efficiently. I'm looking
>>>>>> >>>> forward to the linker-based dependency graph feature.
>>>>>> >>>>
>>>>>> >>>>
>>>>>> >>>> Thanks
>>>>>> >>>> Steven
>>>>>> >>>> > -----Original Message-----
>>>>>> >>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On
>>>>>> Behalf Of Peter
>>>>>> >>>> > Smith via llvm-dev
>>>>>> >>>> > Sent: Wednesday, February 27, 2019 6:37 PM
>>>>>> >>>> > To: Michael Spencer <bigcheesegs at gmail.com>
>>>>>> >>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org>
>>>>>> >>>> > Subject: Re: [llvm-dev] Linker option to dump dependency graph
>>>>>> >>>> >
>>>>>> >>>> > Hello,
>>>>>> >>>> >
>>>>>> >>>> > I think outputting a dependency graph is a good idea and would
>>>>>> >>>> > enable some offline analysis. I think that there is some advantage
>>>>>> >>>> > to building some of the simpler analyses in, particularly those
>>>>>> >>>> > that would need heavy annotations to the dependency graph. In
>>>>>> >>>> > particular, unless we write a sample analysis tool that ships with
>>>>>> >>>> > the release, many users are going to miss out on useful features as
>>>>>> >>>> > they aren't going to have the time to build one. I've put some
>>>>>> >>>> > comments inline:
>>>>>> >>>> >
>>>>>> >>>> > On Wed, 27 Feb 2019 at 00:31, Michael Spencer via llvm-dev
>>>>>> >>>> > <llvm-dev at lists.llvm.org> wrote:
>>>>>> >>>> > >
>>>>>> >>>> > > On Tue, Feb 26, 2019 at 4:06 PM Rui Ueyama <ruiu at google.com>
>>>>>> wrote:
>>>>>> >>>> > >>
>>>>>> >>>> > >> On Tue, Feb 26, 2019 at 3:31 PM Michael Spencer
>>>>>> >>>> > <bigcheesegs at gmail.com> wrote:
>>>>>> >>>> > >>>
>>>>>> >>>> > >>> On Tue, Feb 26, 2019 at 2:23 PM Rui Ueyama via llvm-dev
>>>>>> >>>> > >>> <llvm-dev at lists.llvm.org> wrote:
>>>>>> >>>> > >>>>
>>>>>> >>>> > >>>> Hi,
>>>>>> >>>> > >>>>
>>>>>> >>>> > >>>> I've heard people say that they want to analyze dependencies
>>>>>> >>>> > >>>> between object files at the linker level so that they can run
>>>>>> >>>> > >>>> a whole-program analysis, which cannot be done in the compiler
>>>>>> >>>> > >>>> because it works on one compilation unit at a time. I'd like
>>>>>> >>>> > >>>> to start a discussion about what we can do with it and how to
>>>>>> >>>> > >>>> make it possible. I'm also sharing my idea about how to make it
>>>>>> >>>> > >>>> possible.
>>>>>> >>>> > >>>>
>>>>>> >>>> > >>>> Dependency analyses
>>>>>> >>>> > >>>> First, let me start with a few examples of analyses I've
>>>>>> >>>> > >>>> heard of or am thinking about. Dependencies between object
>>>>>> >>>> > >>>> files can be represented as a graph where vertices are input
>>>>>> >>>> > >>>> sections and edges are symbols and relocations. Analyses would
>>>>>> >>>> > >>>> work on the dependency graph. Examples of analyses include, but
>>>>>> >>>> > >>>> are not limited to, the following:
>>>>>> >>>> > >>>>
>>>>>> >>>> > >>>>  - Figure out why some library or an object file gets
>>>>>> linked.
>>>>>> >>>> > >>>>
>>>>>> >>>> >
>>>>>> >>>> > Arm's proprietary linker has a very helpful feature in verbose
>>>>>> >>>> > mode where it will report on object loading: global/weak
>>>>>> >>>> > definitions and global/weak references. For libraries you'd get a
>>>>>> >>>> > message like "selecting member.o from library.a to define symbol
>>>>>> >>>> > S". This resulted in quite an effective trace of the linker output
>>>>>> >>>> > that could answer most "why did this library and object file get
>>>>>> >>>> > loaded?" questions. One thing a dependency graph might not capture
>>>>>> >>>> > is the order in which events occur; this can be very useful when
>>>>>> >>>> > debugging problems caused by library selection order.
>>>>>> >>>> >
>>>>>> >>>> > >>>>  - Finding a candidate dependency to eliminate by finding a
>>>>>> >>>> > >>>>    "weak" link to a library. We can, for example, say that the
>>>>>> >>>> > >>>>    dependency on a library is weak if the library becomes
>>>>>> >>>> > >>>>    unreachable in the graph once we remove N edges (which is
>>>>>> >>>> > >>>>    likely to correspond to removing N function calls from the
>>>>>> >>>> > >>>>    code), where N is a small number.
>>>>>> >>>> > >>>>
>>>>>> >>>> > >>>>  - Understanding which new dependencies increase the
>>>>>> >>>> > >>>>    executable size the most, compared to a previous build.
>>>>>> >>>> > >>>>
>>>>>> >>>> >
>>>>>> >>>> > Arm's linker, being focused on embedded systems, has a useful
>>>>>> >>>> > feature that summarises the amount of content taken from each
>>>>>> >>>> > object, broken down into code, ro-data, rw-data etc. This can be
>>>>>> >>>> > helpful in the face of comdat group elimination and optimisations
>>>>>> >>>> > such as garbage collection and ICF, whose effects can be difficult
>>>>>> >>>> > to predict from a dependency graph. It is true that this
>>>>>> >>>> > information could be added as attributes, but again it may just be
>>>>>> >>>> > easier to write a simple analysis pass over the output in the
>>>>>> >>>> > linker.
>>>>>> >>>> >
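The per-object size summary described above can be sketched as follows. This is only an illustration: the `(object, kind, size)` records and the category names are assumed inputs, not something lld currently emits.

```python
from collections import defaultdict

def summarize_sizes(sections):
    """Sum the sizes of the surviving sections per object and category."""
    totals = defaultdict(lambda: defaultdict(int))
    for obj, kind, size in sections:
        totals[obj][kind] += size
    return {obj: dict(kinds) for obj, kinds in totals.items()}

# Hypothetical records for sections that survived GC and ICF.
sections = [
    ("main.o", "code", 120),
    ("main.o", "ro-data", 16),
    ("util.o", "code", 300),
    ("util.o", "rw-data", 8),
    ("util.o", "code", 44),
]
summary = summarize_sizes(sections)
for obj in sorted(summary):
    print(obj, summary[obj])
```

Running the summary over the sections that actually reached the output avoids having to predict the effect of GC or ICF from the graph alone.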
>>>>>> >>>> > >>>>  - Finding bad or circular dependencies between
>>>>>> sub-components.
>>>>>> >>>> > >>>>
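As an example of the last analysis in the list, a circular dependency between sub-components can be found with a plain depth-first search. The sketch below assumes the section-level graph has already been collapsed into a mapping from component to the components it depends on; the component names are made up.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of nodes, or None.

    deps maps a component name to the components it depends on.
    Recursive DFS; very deep graphs would need an iterative version.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in deps.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:               # back edge closes a cycle
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for start in deps:
        if color.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None

deps = {
    "ui": ["net", "base"],
    "net": ["crypto"],
    "crypto": ["net"],      # net and crypto depend on each other
    "base": [],
}
cycle = find_cycle(deps)
print(cycle)
```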
>>>>>> >>>> > >>>> There would be many more analyses you want to run at the
>>>>>> linker input
>>>>>> >>>> > level. Currently, lld doesn't actively support such analyses.
>>>>>> There are a few
>>>>>> >>>> > options to make the linker emit dependency information (e.g.
>>>>>> --cref or -Map),
>>>>>> >>>> > but the output of the options is not comprehensive; you cannot
>>>>>> reconstruct a
>>>>>> >>>> > dependency graph from the output of the options.
>>>>>> >>>> >
>>>>>> >>>> >
>>>>>> >>>> >
>>>>>> >>>> > >>>>
>>>>>> >>>> > >>>> Dumping dependency graph
>>>>>> >>>> > >>>> So, I'm wondering whether it would be desirable to add a new
>>>>>> >>>> > >>>> feature to the linker to dump an entire dependency graph in
>>>>>> >>>> > >>>> such a way that the graph can be reconstructed by reading it
>>>>>> >>>> > >>>> back. Once we have such a feature, we can link a program with
>>>>>> >>>> > >>>> the feature enabled and run any kind of dependency analysis on
>>>>>> >>>> > >>>> the output. You can save dumps to compare against previous
>>>>>> >>>> > >>>> builds. You can run any number of analyses on a dump, instead
>>>>>> >>>> > >>>> of invoking the linker for each analysis.
>>>>>> >>>> > >>>>
>>>>>> >>>> > >>>> I don't have a concrete idea about the output file format,
>>>>>> >>>> > >>>> but I believe it is essentially enough to emit triplets of
>>>>>> >>>> > >>>> (<from input section>, <symbol>, <to input section>), each of
>>>>>> >>>> > >>>> which represents an edge, to reconstruct a graph.
>>>>>> >>>> > >>>>
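To show that such triplets are enough to answer "why did this get linked?", here is a minimal sketch that rebuilds the graph and returns the shortest chain of edges from a root section to a target. The function name and the sample triplets are hypothetical.

```python
from collections import deque

def why_linked(triplets, root, target):
    """Return the shortest chain of (from, symbol, to) edges, or None."""
    graph = {}
    for src, sym, dst in triplets:
        graph.setdefault(src, []).append((sym, dst))

    # Breadth-first search, remembering how each section was first reached.
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for sym, dst in graph.get(node, []):
            if dst not in parent:
                parent[dst] = (node, sym)
                queue.append(dst)

    if target not in parent:
        return None
    chain = []
    node = target
    while parent[node] is not None:
        prev, sym = parent[node]
        chain.append((prev, sym, node))
        node = prev
    return chain[::-1]

triplets = [
    ("main.o:.text", "log_init", "log.o:.text"),
    ("log.o:.text", "compress", "zlib.a(deflate.o):.text"),
    ("main.o:.text", "parse", "parse.o:.text"),
]
path = why_linked(triplets, "main.o:.text", "zlib.a(deflate.o):.text")
for edge in path:
    print(edge)
```

The same dump can be reused for other queries without invoking the linker again.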
>>>>>> >>>> > >>>> Thoughts?
>>>>>> >>>> > >>>
>>>>>> >>>> > >>>
>>>>>> >>>> > >>> Back when I worked on the linker I pretty much always had
>>>>>> a way to
>>>>>> >>>> > dump a graphviz dot file to look at things.  Pretty much every
>>>>>> graph
>>>>>> >>>> > library/tool can read dot files, and they are easy to hack up
>>>>>> a parser for.  You
>>>>>> >>>> > can also add attributes to nodes and edges to store arbitrary
>>>>>> data.
>>>>>> >>>> > >>
>>>>>> >>>> > >>
>>>>>> >>>> > >> That's an interesting idea.
>>>>>> >>>> > >>
>>>>>> >>>> > >>> As for what to put in it, it really depends on how detailed
>>>>>> >>>> > >>> it needs to be. Should symbols and sections be collapsed
>>>>>> >>>> > >>> together? Should it include relocation types? Symbol
>>>>>> >>>> > >>> types/binding/size/etc?
>>>>>> >>>> > >>
>>>>>> >>>> > >>
>>>>>> >>>> > >>  Maybe everything? We can for example emit all symbols and
>>>>>> input
>>>>>> >>>> > sections first, and then emit a graph as the second half of
>>>>>> the output. E.g.
>>>>>> >>>> > >>
>>>>>> >>>> > >> Symbols:
>>>>>> >>>> > >>   <list of symbols>
>>>>>> >>>> > >> Sections:
>>>>>> >>>> > >>   <list of sections>
>>>>>> >>>> > >> Graph:
>>>>>> >>>> > >>  1 2 3  // 1st section depends on 3rd section via 2nd symbol
>>>>>> >>>> > >>  5 1 4  // likewise
>>>>>> >>>> > >
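A reader for the sketched layout could look like the following. The concrete syntax is an assumption: section headers on their own lines, one entry per line, 1-based indices, and `//` comments, with each graph line read as (from section, symbol, to section) as in the example above.

```python
def parse_dump(text):
    """Parse a 'Symbols / Sections / Graph' dump into named edges."""
    symbols, sections, edges = [], [], []
    target = None
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()    # drop trailing comments
        if not line:
            continue
        if line in ("Symbols:", "Sections:", "Graph:"):
            target = line
        elif target == "Symbols:":
            symbols.append(line)
        elif target == "Sections:":
            sections.append(line)
        elif target == "Graph:":
            src, sym, dst = (int(tok) for tok in line.split())
            # Indices are 1-based; the tables must precede the graph.
            edges.append((sections[src - 1], symbols[sym - 1],
                          sections[dst - 1]))
    return edges

DUMP = """\
Symbols:
  main
  puts
Sections:
  main.o:.text
  main.o:.rodata
  libc.a(puts.o):.text
Graph:
  1 2 3  // 1st section depends on 3rd section via 2nd symbol
"""
parsed = parse_dump(DUMP)
print(parsed)
```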
>>>>>> >>>> > >
>>>>>> >>>> > > I suppose it's a question of whether we want users to also
>>>>>> >>>> > > need to read the inputs if they want things like section size and
>>>>>> >>>> > > other section/symbol attributes. It would be pretty trivial to
>>>>>> >>>> > > include that data as long as we have a format/syntax for it.
>>>>>> >>>> > >
>>>>>> >>>> > > dot supports listing nodes first with attributes and then
>>>>>> referring to them by
>>>>>> >>>> > name later when listing edges.
>>>>>> >>>> > >
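The "nodes first, then edges" layout in dot could be produced along these lines. This is a sketch only: the `size` attribute and the node names are invented for illustration, and the writer does nothing beyond basic string quoting.

```python
def quote(s):
    """Quote a string as a DOT double-quoted ID."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'

def to_dot(nodes, edges):
    """Emit nodes with their attributes first, then edges by node name."""
    lines = ["digraph deps {"]
    for name, attrs in nodes.items():
        attr_text = ", ".join(f"{k}={quote(str(v))}" for k, v in attrs.items())
        lines.append(f"  {quote(name)} [{attr_text}];")
    for src, sym, dst in edges:
        lines.append(f"  {quote(src)} -> {quote(dst)} [label={quote(sym)}];")
    lines.append("}")
    return "\n".join(lines)

# Hypothetical nodes carrying a custom "size" attribute.
nodes = {
    "main.o:.text": {"size": 120},
    "libc.a(puts.o):.text": {"size": 412},
}
edges = [("main.o:.text", "puts", "libc.a(puts.o):.text")]
dot = to_dot(nodes, edges)
print(dot)
```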
>>>>>> >>>> > > - Michael Spencer
>>>>>> >>>> > >
>>>>>> >>>> >
>>>>>> >>>> > I've experimented with dot files for this type of thing in the
>>>>>> >>>> > past. The difficulty is that they very quickly get too large to be
>>>>>> >>>> > realistically viewed. At that point you need to write scripts to
>>>>>> >>>> > process the output, and in that case you may as well use JSON or
>>>>>> >>>> > XML, which I guess could easily be processed into dot files. To
>>>>>> >>>> > summarise, I think we may be able to do quite well with some very
>>>>>> >>>> > simple extra analysis in LLD; a machine-readable dependency graph
>>>>>> >>>> > would also be very useful for the more complex cases.
>>>>>> >>>> >
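The kind of script alluded to here might, for instance, cut a large JSON dump down to the neighbourhood of one node before anything is rendered. The JSON schema below (an `edges` array of objects with `from`, `symbol` and `to` keys) is purely an assumption for this sketch.

```python
import json
from collections import deque

def neighborhood(dump_json, start, max_hops):
    """Keep only the edges within max_hops of start."""
    data = json.loads(dump_json)
    out = {}
    for edge in data["edges"]:
        out.setdefault(edge["from"], []).append(edge)

    depth = {start: 0}
    queue = deque([start])
    kept = []
    while queue:
        node = queue.popleft()
        if depth[node] == max_hops:
            continue                      # do not expand past the limit
        for edge in out.get(node, []):
            kept.append(edge)
            if edge["to"] not in depth:
                depth[edge["to"]] = depth[node] + 1
                queue.append(edge["to"])
    return kept

DUMP = json.dumps({"edges": [
    {"from": "a.o", "symbol": "f", "to": "b.o"},
    {"from": "b.o", "symbol": "g", "to": "c.o"},
    {"from": "c.o", "symbol": "h", "to": "d.o"},
]})
near = neighborhood(DUMP, "a.o", 2)
for edge in near:
    print(edge["from"], "->", edge["to"], "via", edge["symbol"])
```

The filtered edge list is small enough to convert to dot and view, which the full graph usually is not.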
>>>>>> >>>> > Peter
>>>>>> >>>> >
>>>>>> >>>> >  _______________________________________________
>>>>>> >>>> > > LLVM Developers mailing list
>>>>>> >>>> > > llvm-dev at lists.llvm.org
>>>>>> >>>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>> >>
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Peter
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> 宋方睿
>>>>>>
>>>>>
>>>>
>
> --
> 宋方睿
>