[PATCH] D13501: [ELF2] Add -wrap switch

Wed Oct 7 22:55:09 PDT 2015

I've added some examples to show the differences between your patch and 
GNU ld's behavior, see http://reviews.llvm.org/D13528. For me, it's not 
just the different way to implement the switch, but mostly change its 
meaning.

On 08.10.2015 02:12, Rui Ueyama wrote:
> On Wed, Oct 7, 2015 at 10:28 AM, Igor Kudrin <ikudrin.dev at gmail.com 
> <mailto:ikudrin.dev at gmail.com>> wrote:
>
>     They have quite a good explanation of the option's behavior and
>     both ld and gold produce the same result. The option should only
>     change the resolution target of the specified symbols and
>     shouldn't do anything else; otherwise, the generated result will
>     be unpredictable. In particular, we should neither change the name
>     of a defined symbol nor remove it.
>
>     OK, I can suggest a bit different approach. What if we add a field
>     to the Symbol class to redirect undefined symbol bodies to another
>     Symbol? It'll require a small update in resolve() and maybe in a
>     few other places, but the additional hash map will be gone. What
>     do you think?
>
>
> We may be overthinking about that. I don't think that the difference 
> between LLD and GNU ld is going to be a real issue in real-world 
> usage, and even it does, we can do whatever to fix that after we 
> recognize the need. So I think something like 
> http://reviews.llvm.org/D13528 suffices.
>
>
>     On 07.10.2015 22:38, Rui Ueyama wrote:
>>     We are not necessarily aim 100% precise compatibility with GNU ld
>>     in every detail. GNU ld may have implemented this feature in a
>>     way that's natural to them. I'd really want to implement this
>>     feature in a way as the LLD architecture is designed for. This
>>     --wrap option is by nature a bit hacky option, and users need to
>>     understand what they are doing. GNU ld manual does not mention
>>     about the details about how this option is supposed to work.
>>
>>     On Wed, Oct 7, 2015 at 9:12 AM, Igor Kudrin
>>     <ikudrin.dev at gmail.com <mailto:ikudrin.dev at gmail.com>> wrote:
>>
>>
>>
>>         On 07.10.2015 20:22, Rui Ueyama wrote:
>>>         On Wed, Oct 7, 2015 at 6:59 AM, Igor Kudrin
>>>         <ikudrin.dev at gmail.com <mailto:ikudrin.dev at gmail.com>> wrote:
>>>
>>>             ikudrin added a comment.
>>>
>>>             In http://reviews.llvm.org/D13501#261600, @ruiu wrote:
>>>
>>>             > This patch needs redesigning because we don't want to
>>>             look up hash tables more than once for each symbol. In
>>>             this patch, names for undefined symbols are looked up
>>>             twice -- once in the InputFile.cpp and the other in
>>>             SymbolTable.cpp.
>>>
>>>
>>>             We have to look up for different names for defined and
>>>             undefined symbols, so we can't use just one hash map.
>>>
>>>             In most cases, when the -wrap switch is not used and
>>>             UndefSymNameReplacement is empty, addition lookup will
>>>             be very cheap, without calculating a hash value at all.
>>>             On the other hand, we can use just std::map which
>>>             expected to work really quick in our case.
>>>
>>>
>>>         No, we don't have to look up hash table more than once. I
>>>         thought a bit more about this and noticed that the LLD
>>>         architecture (the separation of Symbol and SymbolBody) would
>>>         really play well here. Symbols are just pointers to
>>>         SymbolBodies, and symbol renaming is as easy as single
>>>         pointer mutation. Here's the idea.
>>>
>>>         First we resolve all symbols without considering --wrap
>>>         option (so no overhead for --wrap during file parsing and
>>>         symbol resolution). When symbol resolution is done, we
>>>         basically have three Symbols for --wrap'ed symbol S in the
>>>         symbol table: S, __wrap_S, and __real_S. Let X, Y and Z be
>>>         their SymbolBodies, respectively.
>>>
>>>         Originally the relationships are
>>>
>>>           S -> X
>>>           __wrap_S -> Y
>>>           __real_S -> Z
>>>
>>>         We swap the pointers so that their relationships are
>>>
>>>           S -> Y
>>>           __wrap_S -> Y
>>>           __real_S -> X
>>>
>>>         Now you can see that all symbols that would have been
>>>         resolved to S without --wrap are now resolved as if they
>>>         were __wrap_S. __real_S is also properly resolved to the
>>>         symbol that was once the real symbol pointed to. This is the
>>>         same result as what --wrap expects.
>>
>>         Thank you for explaining your idea. It can work in some
>>         cases, but I dread it's unable to cover all situations.
>>
>>         1) Suppose we have all three symbols defined. The command
>>         like "ld.lld2 a.o -wrap=S" in your case will eliminate Z and
>>         change content for symbols S and __real_S (see
>>         SymbolTableSection<ELFT>::writeGlobalSymbols). That's not
>>         right, GNU's ld preserves symbols in this case.
>>
>>         2) Suppose that S and __wrap_S are defined in different
>>         object files, which lay in archives. With switch "-wrap=S"
>>         (and if we have a reference to S) we should not get the
>>         content of the first object file at all, but how can we avoid
>>         parsing it if there is the record 'S' in the SymbolTable?
>>
>>         3) If we generate DSO and have an unresolved undefined symbol
>>         S we have to rename it and store as __wrap_S. It'll require
>>         more complicated update of symbols in the symbol table to
>>         support this case.
>>
>>         Finally, my approach, I hope, preserves the original idea of
>>         relationships in the symbol table. With my case, we will have
>>         links like these:
>>         (Body for undefined S, name is "__wrap_S") -> (Symbol with
>>         name "__wrap_S") -> (Body for defined symbol "__wrap_S")
>>         (Body for defined __wrap_S) -> (Symbol with name "__wrap_S")
>>         -> (Body for defined symbol "__wrap_S")
>>         (Body for defined S) -> (Symbol with name "S") -> (Body for
>>         defined symbol "S")
>>
>>         With your proposal we'll end up with links like:
>>         (Body for undefined S, name is "S") -> (Symbol with name "S")
>>         -> (Body for defined symbol "__wrap_S")
>>         (Body for defined __wrap_S) -> (Symbol with name "__wrap_S")
>>         -> (Body for defined symbol "__wrap_S")
>>         (Body for defined S) -> (Symbol with name "S") -> (Body for
>>         defined symbol "__wrap_S")
>>
>>         The last line is a bit unexpected and can be a source for
>>         unwitting bugs.
>>
>>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151008/a3e33959/attachment.html>