[llvm-dev] RFC: Moving the module summary into the irsymtab

Mehdi AMINI via llvm-dev llvm-dev at lists.llvm.org
Mon May 1 11:06:52 PDT 2017


2017-04-25 12:11 GMT-07:00 Peter Collingbourne <peter at pcc.me.uk>:

> Hi all,
>
> I've been making a number of changes to the summary representation
> recently, and I wanted to lay out some of my plans so that folks are aware
> of my ultimate direction with this.
>
> Basically I want to move the summary into the irsymtab that we will be
> storing to disk after D32061 lands. This would help solve a number of
> problems:
> - To read a summary, you need to read all summaries in a module. For
> example, if a client only wants to read summaries for prevailing symbols,
> it still needs to read summaries for all symbols. We should be able to
> design an API that lets clients avoid reading summaries for known
> non-prevailing symbols.
>

Why should we? Efficiency?
Where in the process would we benefit from that?
Right now we may still cross-module import and inline a non-prevailing
symbol if the prevailing symbol isn't IR-defined.



> - Regular LTO modules do not have summaries. This means that dead
> stripping is less effective if the module contains both regular and thin
> LTO modules. (Although this is a somewhat orthogonal problem, since I am
> making a format change, I'd like to take care of it as part of the same
> change.)
>

How do you want to deal with that?
Do you mind giving a quick example (LTO defines foo, used from bar in a
ThinLTO module, but bar is dead)?

Also, I assume code size isn't an issue, since -ffunction-sections always
allows dead-stripping post-LTO (even though we lose compile-time
efficiency and potentially a few optimisations).
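
To make the foo/bar scenario concrete, here is a minimal sketch of liveness
propagation over reference edges. It is not LLVM code: RefGraph,
computeLive and exampleGraph are hypothetical names, and the graph stands in
for whatever reference edges the summaries would carry. With edges available
for both modules, neither bar nor foo is reached from the root. Adding foo as
an extra root models the conservative case where the regular LTO module has
no summary; that modelling is my assumption, not a description of the
current implementation.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for per-symbol summaries: only reference edges.
using RefGraph = std::map<std::string, std::vector<std::string>>;

// Mark every symbol reachable from the roots through reference edges.
// A symbol with no entry in the graph is treated as having no outgoing edges.
std::set<std::string> computeLive(const RefGraph &Refs,
                                  const std::vector<std::string> &Roots) {
  std::set<std::string> Live;
  std::vector<std::string> Worklist(Roots.begin(), Roots.end());
  while (!Worklist.empty()) {
    std::string Sym = Worklist.back();
    Worklist.pop_back();
    if (!Live.insert(Sym).second)
      continue; // already visited
    auto It = Refs.find(Sym);
    if (It == Refs.end())
      continue;
    for (const std::string &Target : It->second)
      Worklist.push_back(Target);
  }
  return Live;
}

// The scenario from the question: main and bar are in a ThinLTO module,
// foo is in a regular LTO module; bar references foo, nothing references bar.
RefGraph exampleGraph() {
  return {{"main", {}}, {"bar", {"foo"}}, {"foo", {}}};
}
```

Calling computeLive(exampleGraph(), {"main"}) leaves both bar and foo out of
the live set, while computeLive(exampleGraph(), {"main", "foo"}) keeps foo
even though its only user is dead.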


>
> Basically, the summaries would be stored in an auxiliary structure like
> storage::Uncommon with a flag in the storage::Symbol indicating whether a
> given symbol has a summary.
>
> Currently we use the presence of a summary to indicate whether to compile
> a module with regular or thin LTO. This will need to change if we want to
> store summaries for regular LTO modules. To that end, I want to add a
> record to all bitcode modules to be compiled with thin LTO that marks them
> as such. This will be used in place of the presence of the summary. For
> backwards compatibility, the presence of a summary in bitcode format will
> be used to mark modules as needing to be compiled with thin LTO.
>
> Because the contents of the summaries, unlike the irsymtab, for the most
> part do not need to be accurate for correctness,
>

I'm not sure what you mean about accuracy and correctness. The "for the
most part" especially raises the question: if there is any part that needs
to be accurate for correctness, that's enough to require accuracy for the
summary, period.



> I think we don't need as strict rules as we do for the rest of the
> irsymtab. I.e. we don't need to rebuild the summary entirely if the LLVM
> revision changes, as I am doing for the irsymtab in D32061. However, the
> summary must have a correct set of reference edges in order to implement
> dead stripping.
>

Note that in the Apple ecosystem, we claim backward compatibility and want
ThinLTO static archives built with 4.0 to work seamlessly with 5.0, 6.0,
etc.



> So the solution I have in mind is to pessimise dead stripping *for that
> module* if the LLVM revision is out of date. I.e. the upgraded summary
> would contain a reference edge from every defined symbol to every other
> symbol in the module. Because we had already regenerated the irsymtab as a
> result of the revision change, we will have an accurate symbol table for
> the module, so this seems sound to me.
>

> All other parts of the summary would be preserved, as long as the summary
> format does not change. This means that the function size and hotness for
> example would be copied from the existing summary. To make this work, I
> would add a format version number field to the irsymtab header as part of
> D32061, and copy the remaining information from the existing summary as
> long as the version number is the same.
>

The last two paragraphs aren't totally clear to me (why this is needed
and what the practical impact is).
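
For reference, this is how I read the upgrade scheme in those two
paragraphs, as a rough C++17 sketch. Everything here is hypothetical
(SymbolSummary, ModuleSummary, upgradeSummary and their fields are not the
real irsymtab layout), and returning "no summary" on a format version
mismatch is my assumption, since the proposal does not say what happens in
that case.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical per-symbol summary; not the real on-disk layout.
struct SymbolSummary {
  std::string Name;
  uint32_t InstCount;            // e.g. function size
  bool Hot;                      // e.g. hotness
  std::vector<std::string> Refs; // reference edges used for dead stripping
};

// Hypothetical per-module summary with the two version checks discussed.
struct ModuleSummary {
  std::string Revision;   // LLVM revision that produced the summary
  uint32_t FormatVersion; // proposed format version field in the header
  std::vector<SymbolSummary> Symbols;
};

// SymtabNames is the symbol list from the freshly regenerated irsymtab.
// Returns the summary to use, or std::nullopt if nothing can be reused.
std::optional<ModuleSummary>
upgradeSummary(const ModuleSummary &Old, const std::string &CurRevision,
               uint32_t CurFormatVersion,
               const std::vector<std::string> &SymtabNames) {
  if (Old.FormatVersion != CurFormatVersion)
    return std::nullopt; // assumption: case not covered by the proposal
  if (Old.Revision == CurRevision)
    return Old; // same revision: keep the summary unchanged
  // Revision changed: keep size and hotness, but pessimise the edges by
  // pointing every summarised symbol at every other symbol in the module.
  ModuleSummary New = Old;
  New.Revision = CurRevision;
  for (SymbolSummary &S : New.Symbols) {
    S.Refs.clear();
    for (const std::string &Name : SymtabNames)
      if (Name != S.Name)
        S.Refs.push_back(Name);
  }
  return New;
}
```

If that reading is right, a stale module never loses a symbol that is really
referenced, but nothing inside it can be dead-stripped unless the whole
module is unreachable.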

Thanks,

-- 
Mehdi