[llvm-dev] RFC: Moving the module summary into the irsymtab

Peter Collingbourne via llvm-dev llvm-dev at lists.llvm.org
Tue Apr 25 12:11:47 PDT 2017


Hi all,

I've been making a number of changes to the summary representation
recently, and I wanted to lay out some of my plans so that folks are aware
of my ultimate direction with this.

Basically I want to move the summary into the irsymtab that we will be
storing to disk after D32061 lands. This would help solve a number of
problems:
- To read a summary, you need to read all summaries in a module. For
example, if a client only wants to read summaries for prevailing symbols,
it still needs to read summaries for all symbols. We should be able to
design an API that lets clients avoid reading summaries for known
non-prevailing symbols.
- Regular LTO modules do not have summaries. This means that dead stripping
is less effective if a link contains both regular and thin LTO modules.
(Although this is a somewhat orthogonal problem, since I am making a format
change, I'd like to take care of it as part of the same change.)

Concretely, the summaries would be stored in an auxiliary structure like
storage::Uncommon, with a flag in storage::Symbol indicating whether a
given symbol has a summary.
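
To illustrate the layout, here is a minimal sketch of a flagged symbol
table with a side array of summaries, and a reader that skips summaries for
non-prevailing symbols. All of the names (SymbolSketch, SummarySketch,
SymtabSketch, readPrevailingSummaries) are hypothetical simplifications and
not the real irsymtab types. It also assumes that the side array holds one
entry per flagged symbol, in symbol order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a per-symbol summary record.
struct SummarySketch {
  uint32_t InstCount; // e.g. function size
};

// Hypothetical stand-in for storage::Symbol: only the flag word is modelled.
struct SymbolSketch {
  enum : uint32_t { HasSummary = 1u << 0 }; // assumed flag bit
  uint32_t Flags = 0;
  bool hasSummary() const { return (Flags & HasSummary) != 0; }
};

// Symbols plus an auxiliary array holding one summary per flagged symbol,
// stored in symbol order (an assumption of this sketch).
struct SymtabSketch {
  std::vector<SymbolSketch> Symbols;
  std::vector<SummarySketch> Summaries;
};

// Returns, for each symbol, a pointer to its summary if the symbol has one
// and is prevailing, or nullptr otherwise. Summaries of non-prevailing
// symbols are stepped over without being inspected.
std::vector<const SummarySketch *>
readPrevailingSummaries(const SymtabSketch &T,
                        const std::vector<bool> &Prevailing) {
  assert(Prevailing.size() == T.Symbols.size());
  std::vector<const SummarySketch *> Result(T.Symbols.size(), nullptr);
  size_t Next = 0; // index of the next unread entry in T.Summaries
  for (size_t I = 0; I != T.Symbols.size(); ++I) {
    if (!T.Symbols[I].hasSummary())
      continue;
    size_t Slot = Next++;
    if (Prevailing[I])
      Result[I] = &T.Summaries[Slot];
  }
  return Result;
}
```

The returned pointers refer into the table, so the table has to outlive
them.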

Currently we use the presence of a summary to indicate whether to compile a
module with regular or thin LTO. This will need to change if we want to
store summaries for regular LTO modules. To that end, I want to add a
record to all bitcode modules to be compiled with thin LTO that marks them
as such. This will be used in place of the presence of the summary. For
backwards compatibility, the presence of a summary in the existing bitcode
format will continue to mark a module as needing to be compiled with thin
LTO.
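
The selection rule described above could be sketched roughly as follows.
The type and field names (ModuleInfoSketch, HasThinLTOMarker,
HasBitcodeSummary, selectLTOKind) are made up for illustration and do not
correspond to existing LLVM APIs.

```cpp
#include <cassert>

enum class LTOKind { Regular, Thin };

// Hypothetical description of what was found when reading a module.
struct ModuleInfoSketch {
  bool HasThinLTOMarker;  // the proposed new "compile with thin LTO" record
  bool HasBitcodeSummary; // a summary in the existing bitcode format
};

// New-style modules are classified by the marker record alone. Older
// modules have no marker, so a bitcode-format summary still implies thin LTO.
LTOKind selectLTOKind(const ModuleInfoSketch &M) {
  if (M.HasThinLTOMarker)
    return LTOKind::Thin;
  if (M.HasBitcodeSummary)
    return LTOKind::Thin; // backwards compatibility path
  return LTOKind::Regular;
}
```

A summary stored in the irsymtab deliberately plays no part in the
decision, since regular LTO modules would carry one too.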

Because the contents of the summaries, unlike the irsymtab, for the most
part do not need to be accurate for correctness, I think we don't need as
strict rules as we do for the rest of the irsymtab. I.e. we don't need to
rebuild the summary entirely if the LLVM revision changes, as I am doing
for the irsymtab in D32061. However, the summary must have a correct set of
reference edges in order to implement dead stripping.

So the solution I have in mind is to pessimise dead stripping *for that
module* if its summary was written by a different LLVM revision. I.e. the
upgraded summary would contain a reference edge from every defined symbol
to every other symbol in the module. Because we will already have
regenerated the irsymtab as a result of the revision change, we will have
an accurate symbol table for the module, so this seems sound to me.
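
As a rough sketch of the pessimistic upgrade, the function below builds the
conservative edge set from nothing but the (accurate) symbol table.
Symbols are identified by their index, and the function name and
representation are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Given which symbols of a module are defined, build a conservative set of
// reference edges: every defined symbol references every other symbol in
// the module. Undefined symbols get no outgoing edges.
std::vector<std::vector<uint32_t>>
buildPessimisticRefs(const std::vector<bool> &IsDefined) {
  std::vector<std::vector<uint32_t>> Refs(IsDefined.size());
  for (size_t From = 0; From != IsDefined.size(); ++From) {
    if (!IsDefined[From])
      continue;
    for (size_t To = 0; To != IsDefined.size(); ++To)
      if (To != From)
        Refs[From].push_back(static_cast<uint32_t>(To));
  }
  return Refs;
}
```

With these edges, any symbol in the module is reachable as soon as one of
the module's defined symbols is live, so dead stripping can only become
less aggressive for that module, never incorrect. The edge count is
quadratic in the number of symbols, so a real implementation would
presumably encode this with a per-module flag rather than explicit edges;
that is my assumption and not part of the proposal.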

All other parts of the summary would be preserved, as long as the summary
format does not change. This means that, for example, the function size
and hotness would be copied from the existing summary. To make this work, I
would add a format version number field to the irsymtab header as part of
D32061, and copy the remaining information from the existing summary as
long as the version number is the same.
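
A minimal sketch of that upgrade rule is below. The names
(FunctionSummarySketch, CurrentSummaryVersion, upgradeSummary) and the
version value are hypothetical. What happens when the format version
differs is not specified above; this sketch simply reports failure so the
caller can fall back to something else.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary contents; only the fields mentioned above.
struct FunctionSummarySketch {
  uint32_t InstCount;      // function size
  uint8_t Hotness;         // hotness, in some encoding
  bool RefsArePessimistic; // true if reference edges were made conservative
};

// Assumed value; stands in for the version field in the irsymtab header.
const uint32_t CurrentSummaryVersion = 1;

// Copies Old into Out if the stored format version matches. If the LLVM
// revision differs, the copied summary is marked as having pessimistic
// reference edges. Returns false (leaving Out untouched) on a version
// mismatch, which is an assumption of this sketch.
bool upgradeSummary(const FunctionSummarySketch &Old, uint32_t StoredVersion,
                    bool RevisionMatches, FunctionSummarySketch &Out) {
  if (StoredVersion != CurrentSummaryVersion)
    return false;
  Out = Old; // size and hotness carried over unchanged
  if (!RevisionMatches)
    Out.RefsArePessimistic = true;
  return true;
}
```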

Does that seem reasonable?

Thanks,
-- 
Peter