<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Oct 8, 2015 at 9:33 PM, Igor Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><span class="">
<br>
<br>
<div>On 09.10.2015 4:08, Rui Ueyama wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On Wed, Oct 7, 2015 at 10:55 PM, Igor
Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> I've added some
examples to show the differences between your patch and
GNU ld's behavior, see <a href="http://reviews.llvm.org/D13528" target="_blank">http://reviews.llvm.org/D13528</a>.
For me, it's not just the different way to implement the
switch, but mostly change its meaning.</div>
</blockquote>
<div><br>
</div>
<div>It is indeed different from GNU, but I think our
behavior makes sense and I don't think that's a bad
difference. LLD's symbol handling is already pretty
different in some area, particularly how we handle archive
files and --start-group/--end-group. Just like that, this
is something we don't want to aim the exact compatibility.</div>
</div>
</div>
</div>
</blockquote>
<br></span>
Any unnecessary difference may lead to unexpected problems for end
users. According to this, I don't feel like accepting your approach,
which, compared to ld's behavior, is different in almost all aspects
I can imagine, whereas my original patch follows ld's and gold's
conduct pretty well. Maybe it's better not implement this switch at
all.<br></div></blockquote><div><br></div><div>I'd agree. Maybe we want some implementation of --wrap in future, but probably it's too early. We can do that later after getting better idea what end users would say.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
Concerning the symbol table, the new mechanic of symbol handling is
indeed different, but it implies the same rules of symbol resolution
and seems like it can imitate GNU's behavior someday, even with
--(start|end)-group.</div></blockquote><div><br></div><div>I doubt that we would want to imitate --start-group/end-group behavior. That seems a hack to work around Unix linker's keep-only-undefined-symbols semantics, and we don't need that hack at all by design. Since large programs such as Clang can be happily linked with the new LLD, I believe we don't have to imitate --start-group/end-group.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><div><div class="h5"><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div>
<div><br>
<br>
<div>On 08.10.2015 02:12, Rui Ueyama wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On Wed, Oct 7, 2015
at 10:28 AM, Igor Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
They have quite a good explanation of
the option's behavior and both ld and
gold produce the same result. The option
should only change the resolution target
of the specified symbols and shouldn't
do anything else; otherwise, the
generated result will be unpredictable.
In particular, we should neither change
the name of a defined symbol nor remove
it.<br>
<br>
OK, I can suggest a bit different
approach. What if we add a field to the
Symbol class to redirect undefined
symbol bodies to another Symbol? It'll
require a small update in resolve() and
maybe in a few other places, but the
additional hash map will be gone. What
do you think?</div>
</blockquote>
<div><br>
</div>
<div>We may be overthinking about that. I
don't think that the difference between
LLD and GNU ld is going to be a real issue
in real-world usage, and even it does, we
can do whatever to fix that after we
recognize the need. So I think something
like <a href="http://reviews.llvm.org/D13528" target="_blank">http://reviews.llvm.org/D13528</a>
suffices.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>
<div> <br>
<div>On 07.10.2015 22:38, Rui Ueyama
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">We are not
necessarily aim 100% precise
compatibility with GNU ld in
every detail. GNU ld may have
implemented this feature in a
way that's natural to them. I'd
really want to implement this
feature in a way as the LLD
architecture is designed for.
This --wrap option is by nature
a bit hacky option, and users
need to understand what they are
doing. GNU ld manual does not
mention about the details about
how this option is supposed to
work.<br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Wed,
Oct 7, 2015 at 9:12 AM, Igor
Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>
<div> <br>
<br>
<div>On 07.10.2015
20:22, Rui Ueyama
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On
Wed, Oct 7,
2015 at 6:59
AM, Igor
Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">ikudrin
added a
comment.<br>
<span><br>
In <a href="http://reviews.llvm.org/D13501#261600" target="_blank"></a><a href="http://reviews.llvm.org/D13501#261600" target="_blank">http://reviews.llvm.org/D13501#261600</a>,
@ruiu wrote:<br>
<br>
> This
patch needs
redesigning
because we
don't want to
look up hash
tables more
than once for
each symbol.
In this patch,
names for
undefined
symbols are
looked up
twice -- once
in the
InputFile.cpp
and the other
in
SymbolTable.cpp.<br>
<br>
<br>
</span>We have
to look up for
different
names for
defined and
undefined
symbols, so we
can't use just
one hash map.<br>
<br>
In most cases,
when the -wrap
switch is not
used and
UndefSymNameReplacement
is empty,
addition
lookup will be
very cheap,
without
calculating a
hash value at
all. On the
other hand, we
can use just
std::map which
expected to
work really
quick in our
case.<br>
<br>
</blockquote>
<div><br>
</div>
<div>No, we
don't have to
look up hash
table more
than once. I
thought a bit
more about
this and
noticed that
the LLD
architecture
(the
separation of
Symbol and
SymbolBody)
would really
play well
here. Symbols
are just
pointers to
SymbolBodies,
and symbol
renaming is as
easy as single
pointer
mutation.
Here's the
idea.</div>
<div><br>
</div>
<div>First we
resolve all
symbols
without
considering
--wrap option
(so no
overhead for
--wrap during
file parsing
and symbol
resolution).
When symbol
resolution is
done, we
basically have
three Symbols
for --wrap'ed
symbol S in
the symbol
table: S,
__wrap_S, and
__real_S. Let
X, Y and Z be
their
SymbolBodies,
respectively.</div>
<div><br>
</div>
<div>Originally
the
relationships
are</div>
<div><br>
</div>
<div> S ->
X</div>
<div>
__wrap_S ->
Y</div>
<div>
__real_S ->
Z</div>
<div><br>
</div>
<div>We swap
the pointers
so that their
relationships
are</div>
<div><br>
</div>
<div> S ->
Y</div>
<div>
__wrap_S ->
Y</div>
<div>
__real_S ->
X</div>
<div><br>
</div>
<div>Now you
can see that
all symbols
that would
have been
resolved to S
without --wrap
are now
resolved as if
they were
__wrap_S.
__real_S is
also properly
resolved to
the symbol
that was once
the real
symbol pointed
to. This is
the same
result as what
--wrap
expects.</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</div>
Thank you for explaining
your idea. It can work in
some cases, but I dread
it's unable to cover all
situations.<br>
<br>
1) Suppose we have all
three symbols defined. The
command like "ld.lld2 a.o
-wrap=S" in your case will
eliminate Z and change
content for symbols S and
__real_S (see
SymbolTableSection<ELFT>::writeGlobalSymbols).
That's not right, GNU's ld
preserves symbols in this
case.<br>
<br>
2) Suppose that S and
__wrap_S are defined in
different object files,
which lay in archives.
With switch "-wrap=S" (and
if we have a reference to
S) we should not get the
content of the first
object file at all, but
how can we avoid parsing
it if there is the record
'S' in the SymbolTable?<br>
<br>
3) If we generate DSO and
have an unresolved
undefined symbol S we have
to rename it and store as
__wrap_S. It'll require
more complicated update of
symbols in the symbol
table to support this
case.<br>
<br>
Finally, my approach, I
hope, preserves the
original idea of
relationships in the
symbol table. With my
case, we will have links
like these:<br>
(Body for undefined S,
name is "__wrap_S") ->
(Symbol with name
"__wrap_S") -> (Body
for defined symbol
"__wrap_S")<br>
(Body for defined
__wrap_S) -> (Symbol
with name "__wrap_S")
-> (Body for defined
symbol "__wrap_S")<br>
(Body for defined S) ->
(Symbol with name "S")
-> (Body for defined
symbol "S")<br>
<br>
With your proposal we'll
end up with links like:<br>
(Body for undefined S,
name is "S") -> (Symbol
with name "S") -> (Body
for defined symbol
"__wrap_S")<br>
(Body for defined
__wrap_S) -> (Symbol
with name "__wrap_S")
-> (Body for defined
symbol "__wrap_S")<br>
(Body for defined S) ->
(Symbol with name "S")
-> (Body for defined
symbol "__wrap_S")<br>
<br>
The last line is a bit
unexpected and can be a
source for unwitting bugs.<br>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
</div></div></div>
</blockquote></div><br></div></div>