[llvm] r199244 - Reapply "LTO: add API to set strategy for -internalize"

Wed Apr 2 13:11:30 PDT 2014

On Apr 2, 2014, at 12:02, Rafael Espíndola <rafael.espindola at gmail.com> wrote:

>>> If I understand it correctly, it
>>> exists so that we don't internalize symbols when using ld -r.
>> 
>> Right.
>> 
>>  - INTERNALIZE_HIDDEN is used by ld -r
>>  - INTERNALIZE_NONE is used by ld -r -keep_private_externs
>> 
>> I expect INTERNALIZE_NONE is generally useful for creating shared objects, where
>> nothing should be internalized.  I'm not sure if there are other use cases for
>> INTERNALIZE_HIDDEN.
>> 
>>> Why can't
>>> the linker simply list the symbols that should not be internalized in
>>> that case as it does for every other file?
>> 
>> For INTERNALIZE_NONE, that requires listing every symbol in the bitcode.  For
>> INTERNALIZE_HIDDEN, it would be on the same scale (listing every symbol that
>> isn't hidden).  The LTO side would then be dealing with extremely large sets of
>> symbols to export.
>> 
>> Adding an API was more efficient and easier to reason about.
>> 
>> Is it causing a problem, or are you generally concerned about C API bloat?
> 
> API bloat in general. What I like about not having special cases like
> this is that it makes it far more obvious which symbols the linker
> wants: only the ones it explicitly asks for.

I dispute that.  I think the linker can communicate more clearly with LTO
*with* the API change, and I don't think it's less obvious which symbols the
linker wants.
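
To make the comparison concrete, here is a minimal sketch of the two ways the
linker could express its intent. Everything in it (the Symbol struct, the
InternalizeStrategy enum, and the three functions) is a hypothetical model
written for this illustration; it is not the libLTO API, and it only mirrors
the FULL/HIDDEN/NONE behaviour as described in this thread.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model for illustration only; these are not libLTO types.
enum class Linkage { External, Internal };
enum class Visibility { Default, Hidden };
enum class InternalizeStrategy { Full, Hidden, None };

struct Symbol {
  std::string Name;
  Visibility Vis;
  Linkage Link;
};

// List-only approach: internalize every symbol the linker did not name.
void internalizeWithList(std::vector<Symbol> &Module,
                         const std::set<std::string> &MustPreserve) {
  for (Symbol &S : Module)
    if (!MustPreserve.count(S.Name))
      S.Link = Linkage::Internal;
}

// Strategy approach: the linker states a policy once and lists exceptions.
void internalizeWithStrategy(std::vector<Symbol> &Module,
                             InternalizeStrategy Strategy,
                             const std::set<std::string> &MustPreserve) {
  for (Symbol &S : Module) {
    if (MustPreserve.count(S.Name))
      continue;
    bool Internalize =
        Strategy == InternalizeStrategy::Full ||
        (Strategy == InternalizeStrategy::Hidden &&
         S.Vis == Visibility::Hidden);
    if (Internalize)
      S.Link = Linkage::Internal;
  }
}

// The preserve list the linker would have to build to emulate a strategy
// with the list-only approach: every symbol for None, every non-hidden
// symbol for Hidden, nothing extra for Full.
std::set<std::string> preserveListFor(const std::vector<Symbol> &Module,
                                      InternalizeStrategy Strategy) {
  std::set<std::string> Names;
  for (const Symbol &S : Module) {
    bool Keep = Strategy == InternalizeStrategy::None ||
                (Strategy == InternalizeStrategy::Hidden &&
                 S.Vis == Visibility::Default);
    if (Keep)
      Names.insert(S.Name);
  }
  return Names;
}
```

In this model, internalizeWithList(M, preserveListFor(M, S)) and
internalizeWithStrategy(M, S, {}) leave the module in the same state, so the
disagreement is about how the intent is expressed and how large the list
gets, not about the result.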

> The flag, in contrast,
> requires a C API addition and an extension to the internalize API.

You're right here; maybe the API change was the wrong direction (especially
given your profiling results).  I'll CC you directly for LTO patches in the
future.

But isn't the C API stable?  Isn't it too late to roll back this change?

> Last time there was some concern about the number of calls to preserve
> symbols I did some benchmarking
> (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130930/189746.html)
> 
> ------------------------------
> I took the
> largest "shared library" I had at hand: clang with -export_dynamic. On
> Linux, gold currently asks for 22922 symbols to be preserved. I applied
> the attached hack-timeit.patch to test how long it takes for us to get
> the internalized combined module. On my machine (2010 iMac), the time
> is 0.08s, so despite the large number of calls and the
> std::map<std::string>, internalize in general is not too slow. For
> scale, the verifier takes about 1s. Parsing the bitcode file takes 5s
> (the file is 118 MB).
> --------------------------------
> 
> Have you found a case where the extra calls are actually noticeable?

Probably not.  Since my approach directly modelled what the linker wants, and
certainly wasn't going to be slower, I never profiled.  (Sorry I missed that
conversation; I arrived too late.)
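
For anyone who wants a rough number of their own, a sketch along these lines
would time the bookkeeping side of the per-symbol calls. It is an assumption
on my part that a std::set of synthetic names is a fair stand-in; it measures
only the insertions, not the internalize pass itself, and timePreserveCalls
is a made-up helper, not anything in the tree.

```cpp
#include <chrono>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for the linker's per-symbol "preserve" calls:
// inserts Count synthetic names into Preserved and returns the elapsed
// wall-clock time in seconds.
double timePreserveCalls(std::size_t Count,
                         std::set<std::string> &Preserved) {
  auto Start = std::chrono::steady_clock::now();
  for (std::size_t I = 0; I < Count; ++I)
    Preserved.insert("sym_" + std::to_string(I));
  auto End = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(End - Start).count();
}
```

Calling it with Count = 22922 matches the symbol count quoted above; the
result will depend on the machine and on real symbol-name lengths, so treat
it as an order-of-magnitude check only.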

Nick, did you do any profiling of this?