<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Oct 7, 2015 at 10:55 PM, Igor Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div text="#000000" bgcolor="#FFFFFF">

    I've added some examples to show the differences between your patch

    and GNU ld's behavior, see <a href="http://reviews.llvm.org/D13528" target="_blank">http://reviews.llvm.org/D13528</a>.

    For me, it's not just the different way to implement the switch, but

    mostly change its meaning.</div></blockquote><div><br></div><div>It is indeed different from GNU, but I think our behavior makes sense and I don't think that's a bad difference. LLD's symbol handling is already pretty different in some area, particularly how we handle archive files and --start-group/--end-group. Just like that, this is something we don't want to aim the exact compatibility.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div><div class="h5"><br>

    <br>

    <div>On 08.10.2015 02:12, Rui Ueyama wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">On Wed, Oct 7, 2015 at 10:28 AM, Igor

            Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000"> They have quite a

                good explanation of the option's behavior and both ld

                and gold produce the same result. The option should only

                change the resolution target of the specified symbols

                and shouldn't do anything else; otherwise, the generated

                result will be unpredictable. In particular, we should

                neither change the name of a defined symbol nor remove

                it.<br>

                <br>

                OK, I can suggest a bit different approach. What if we

                add a field to the Symbol class to redirect undefined

                symbol bodies to another Symbol? It'll require a small

                update in resolve() and maybe in a few other places, but

                the additional hash map will be gone. What do you think?</div>

            </blockquote>

            <div><br>

            </div>

            <div>We may be overthinking about that. I don't think that

              the difference between LLD and GNU ld is going to be a

              real issue in real-world usage, and even it does, we can

              do whatever to fix that after we recognize the need. So I

              think something like <a href="http://reviews.llvm.org/D13528" target="_blank">http://reviews.llvm.org/D13528</a>

              suffices.</div>

            <div> </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000">

                <div>

                  <div> <br>

                    <div>On 07.10.2015 22:38, Rui Ueyama wrote:<br>

                    </div>

                    <blockquote type="cite">

                      <div dir="ltr">We are not necessarily aim 100%

                        precise compatibility with GNU ld in every

                        detail. GNU ld may have implemented this feature

                        in a way that's natural to them. I'd really want

                        to implement this feature in a way as the LLD

                        architecture is designed for. This --wrap option

                        is by nature a bit hacky option, and users need

                        to understand what they are doing. GNU ld manual

                        does not mention about the details about how

                        this option is supposed to work.<br>

                      </div>

                      <div class="gmail_extra"><br>

                        <div class="gmail_quote">On Wed, Oct 7, 2015 at

                          9:12 AM, Igor Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>

                          wrote:<br>

                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                            <div bgcolor="#FFFFFF" text="#000000">

                              <div>

                                <div> <br>

                                  <br>

                                  <div>On 07.10.2015 20:22, Rui Ueyama

                                    wrote:<br>

                                  </div>

                                  <blockquote type="cite">

                                    <div dir="ltr">

                                      <div class="gmail_extra">

                                        <div class="gmail_quote">On Wed,

                                          Oct 7, 2015 at 6:59 AM, Igor

                                          Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>

                                          wrote:<br>

                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">ikudrin

                                            added a comment.<br>

                                            <span><br>

                                              In <a href="http://reviews.llvm.org/D13501#261600" rel="noreferrer" target="_blank"></a><a href="http://reviews.llvm.org/D13501#261600" target="_blank">http://reviews.llvm.org/D13501#261600</a>,

                                              @ruiu wrote:<br>

                                              <br>

                                              > This patch needs

                                              redesigning because we

                                              don't want to look up hash

                                              tables more than once for

                                              each symbol. In this

                                              patch, names for undefined

                                              symbols are looked up

                                              twice -- once in the

                                              InputFile.cpp and the

                                              other in SymbolTable.cpp.<br>

                                              <br>

                                              <br>

                                            </span>We have to look up

                                            for different names for

                                            defined and undefined

                                            symbols, so we can't use

                                            just one hash map.<br>

                                            <br>

                                            In most cases, when the

                                            -wrap switch is not used and

                                            UndefSymNameReplacement is

                                            empty, addition lookup will

                                            be very cheap, without

                                            calculating a hash value at

                                            all. On the other hand, we

                                            can use just std::map which

                                            expected to work really

                                            quick in our case.<br>

                                            <br>

                                          </blockquote>

                                          <div><br>

                                          </div>

                                          <div>No, we don't have to look

                                            up hash table more than

                                            once. I thought a bit more

                                            about this and noticed that

                                            the LLD architecture (the

                                            separation of Symbol and

                                            SymbolBody) would really

                                            play well here. Symbols are

                                            just pointers to

                                            SymbolBodies, and symbol

                                            renaming is as easy as

                                            single pointer mutation.

                                            Here's the idea.</div>

                                          <div><br>

                                          </div>

                                          <div>First we resolve all

                                            symbols without considering

                                            --wrap option (so no

                                            overhead for --wrap during

                                            file parsing and symbol

                                            resolution). When symbol

                                            resolution is done, we

                                            basically have three Symbols

                                            for --wrap'ed symbol S in

                                            the symbol table: S,

                                            __wrap_S, and __real_S. Let

                                            X, Y and Z be their

                                            SymbolBodies, respectively.</div>

                                          <div><br>

                                          </div>

                                          <div>Originally the

                                            relationships are</div>

                                          <div><br>

                                          </div>

                                          <div>  S -> X</div>

                                          <div>  __wrap_S -> Y</div>

                                          <div>  __real_S -> Z</div>

                                          <div><br>

                                          </div>

                                          <div>We swap the pointers so

                                            that their relationships are</div>

                                          <div><br>

                                          </div>

                                          <div>  S -> Y</div>

                                          <div>  __wrap_S -> Y</div>

                                          <div>  __real_S -> X</div>

                                          <div><br>

                                          </div>

                                          <div>Now you can see that all

                                            symbols that would have been

                                            resolved to S without --wrap

                                            are now resolved as if they

                                            were __wrap_S. __real_S is

                                            also properly resolved to

                                            the symbol that was once the

                                            real symbol pointed to. This

                                            is the same result as what

                                            --wrap expects.</div>

                                        </div>

                                      </div>

                                    </div>

                                  </blockquote>

                                  <br>

                                </div>

                              </div>

                              Thank you for explaining your idea. It can

                              work in some cases, but I dread it's

                              unable to cover all situations.<br>

                              <br>

                              1) Suppose we have all three symbols

                              defined. The command like "ld.lld2 a.o

                              -wrap=S" in your case will eliminate Z and

                              change content for symbols S and __real_S

                              (see

                              SymbolTableSection<ELFT>::writeGlobalSymbols).

                              That's not right, GNU's ld preserves

                              symbols in this case.<br>

                              <br>

                              2) Suppose that S and __wrap_S are defined

                              in different object files, which lay in

                              archives. With switch "-wrap=S" (and if we

                              have a reference to S) we should not get

                              the content of the first object file at

                              all, but how can we avoid parsing it if

                              there is the record 'S' in the

                              SymbolTable?<br>

                              <br>

                              3) If we generate DSO and have an

                              unresolved undefined symbol S we have to

                              rename it and store as __wrap_S. It'll

                              require more complicated update of symbols

                              in the symbol table to support this case.<br>

                              <br>

                              Finally, my approach, I hope, preserves

                              the original idea of relationships in the

                              symbol table. With my case, we will have

                              links like these:<br>

                              (Body for undefined S, name is "__wrap_S")

                              -> (Symbol with name "__wrap_S") ->

                              (Body for defined symbol "__wrap_S")<br>

                              (Body for defined __wrap_S) -> (Symbol

                              with name "__wrap_S") -> (Body for

                              defined symbol "__wrap_S")<br>

                              (Body for defined S) -> (Symbol with

                              name "S") -> (Body for defined symbol

                              "S")<br>

                              <br>

                              With your proposal we'll end up with links

                              like:<br>

                              (Body for undefined S, name is "S") ->

                              (Symbol with name "S") -> (Body for

                              defined symbol "__wrap_S")<br>

                              (Body for defined __wrap_S) -> (Symbol

                              with name "__wrap_S") -> (Body for

                              defined symbol "__wrap_S")<br>

                              (Body for defined S) -> (Symbol with

                              name "S") -> (Body for defined symbol

                              "__wrap_S")<br>

                              <br>

                              The last line is a bit unexpected and can

                              be a source for unwitting bugs.<br>

                              <br>

                            </div>

                          </blockquote>

                        </div>

                        <br>

                      </div>

                    </blockquote>

                    <br>

                  </div>

                </div>

              </div>

            </blockquote>

          </div>

          <br>

        </div>

      </div>

    </blockquote>

    <br>

  </div></div></div>

</blockquote></div><br></div></div>