<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Oct 7, 2015 at 10:55 PM, Igor Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    I've added some examples to show the differences between your patch
    and GNU ld's behavior, see <a href="http://reviews.llvm.org/D13528" target="_blank">http://reviews.llvm.org/D13528</a>.
    For me, it's not just the different way to implement the switch, but
    mostly change its meaning.</div></blockquote><div><br></div><div>It is indeed different from GNU, but I think our behavior makes sense and I don't think that's a bad difference. LLD's symbol handling is already pretty different in some area, particularly how we handle archive files and --start-group/--end-group. Just like that, this is something we don't want to aim the exact compatibility.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div><div class="h5"><br>
    <br>
    <div>On 08.10.2015 02:12, Rui Ueyama wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">On Wed, Oct 7, 2015 at 10:28 AM, Igor
            Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"> They have quite a
                good explanation of the option's behavior and both ld
                and gold produce the same result. The option should only
                change the resolution target of the specified symbols
                and shouldn't do anything else; otherwise, the generated
                result will be unpredictable. In particular, we should
                neither change the name of a defined symbol nor remove
                it.<br>
                <br>
                OK, I can suggest a bit different approach. What if we
                add a field to the Symbol class to redirect undefined
                symbol bodies to another Symbol? It'll require a small
                update in resolve() and maybe in a few other places, but
                the additional hash map will be gone. What do you think?</div>
            </blockquote>
            <div><br>
            </div>
            <div>We may be overthinking about that. I don't think that
              the difference between LLD and GNU ld is going to be a
              real issue in real-world usage, and even it does, we can
              do whatever to fix that after we recognize the need. So I
              think something like <a href="http://reviews.llvm.org/D13528" target="_blank">http://reviews.llvm.org/D13528</a>
              suffices.</div>
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000">
                <div>
                  <div> <br>
                    <div>On 07.10.2015 22:38, Rui Ueyama wrote:<br>
                    </div>
                    <blockquote type="cite">
                      <div dir="ltr">We are not necessarily aim 100%
                        precise compatibility with GNU ld in every
                        detail. GNU ld may have implemented this feature
                        in a way that's natural to them. I'd really want
                        to implement this feature in a way as the LLD
                        architecture is designed for. This --wrap option
                        is by nature a bit hacky option, and users need
                        to understand what they are doing. GNU ld manual
                        does not mention about the details about how
                        this option is supposed to work.<br>
                      </div>
                      <div class="gmail_extra"><br>
                        <div class="gmail_quote">On Wed, Oct 7, 2015 at
                          9:12 AM, Igor Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>
                          wrote:<br>
                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                            <div bgcolor="#FFFFFF" text="#000000">
                              <div>
                                <div> <br>
                                  <br>
                                  <div>On 07.10.2015 20:22, Rui Ueyama
                                    wrote:<br>
                                  </div>
                                  <blockquote type="cite">
                                    <div dir="ltr">
                                      <div class="gmail_extra">
                                        <div class="gmail_quote">On Wed,
                                          Oct 7, 2015 at 6:59 AM, Igor
                                          Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>
                                          wrote:<br>
                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">ikudrin


                                            added a comment.<br>
                                            <span><br>
                                              In <a href="http://reviews.llvm.org/D13501#261600" rel="noreferrer" target="_blank"></a><a href="http://reviews.llvm.org/D13501#261600" target="_blank">http://reviews.llvm.org/D13501#261600</a>,
                                              @ruiu wrote:<br>
                                              <br>
                                              > This patch needs
                                              redesigning because we
                                              don't want to look up hash
                                              tables more than once for
                                              each symbol. In this
                                              patch, names for undefined
                                              symbols are looked up
                                              twice -- once in the
                                              InputFile.cpp and the
                                              other in SymbolTable.cpp.<br>
                                              <br>
                                              <br>
                                            </span>We have to look up
                                            for different names for
                                            defined and undefined
                                            symbols, so we can't use
                                            just one hash map.<br>
                                            <br>
                                            In most cases, when the
                                            -wrap switch is not used and
                                            UndefSymNameReplacement is
                                            empty, addition lookup will
                                            be very cheap, without
                                            calculating a hash value at
                                            all. On the other hand, we
                                            can use just std::map which
                                            expected to work really
                                            quick in our case.<br>
                                            <br>
                                          </blockquote>
                                          <div><br>
                                          </div>
                                          <div>No, we don't have to look
                                            up hash table more than
                                            once. I thought a bit more
                                            about this and noticed that
                                            the LLD architecture (the
                                            separation of Symbol and
                                            SymbolBody) would really
                                            play well here. Symbols are
                                            just pointers to
                                            SymbolBodies, and symbol
                                            renaming is as easy as
                                            single pointer mutation.
                                            Here's the idea.</div>
                                          <div><br>
                                          </div>
                                          <div>First we resolve all
                                            symbols without considering
                                            --wrap option (so no
                                            overhead for --wrap during
                                            file parsing and symbol
                                            resolution). When symbol
                                            resolution is done, we
                                            basically have three Symbols
                                            for --wrap'ed symbol S in
                                            the symbol table: S,
                                            __wrap_S, and __real_S. Let
                                            X, Y and Z be their
                                            SymbolBodies, respectively.</div>
                                          <div><br>
                                          </div>
                                          <div>Originally the
                                            relationships are</div>
                                          <div><br>
                                          </div>
                                          <div>  S -> X</div>
                                          <div>  __wrap_S -> Y</div>
                                          <div>  __real_S -> Z</div>
                                          <div><br>
                                          </div>
                                          <div>We swap the pointers so
                                            that their relationships are</div>
                                          <div><br>
                                          </div>
                                          <div>  S -> Y</div>
                                          <div>  __wrap_S -> Y</div>
                                          <div>  __real_S -> X</div>
                                          <div><br>
                                          </div>
                                          <div>Now you can see that all
                                            symbols that would have been
                                            resolved to S without --wrap
                                            are now resolved as if they
                                            were __wrap_S. __real_S is
                                            also properly resolved to
                                            the symbol that was once the
                                            real symbol pointed to. This
                                            is the same result as what
                                            --wrap expects.</div>
                                        </div>
                                      </div>
                                    </div>
                                  </blockquote>
                                  <br>
                                </div>
                              </div>
                              Thank you for explaining your idea. It can
                              work in some cases, but I dread it's
                              unable to cover all situations.<br>
                              <br>
                              1) Suppose we have all three symbols
                              defined. The command like "ld.lld2 a.o
                              -wrap=S" in your case will eliminate Z and
                              change content for symbols S and __real_S
                              (see
                              SymbolTableSection<ELFT>::writeGlobalSymbols).
                              That's not right, GNU's ld preserves
                              symbols in this case.<br>
                              <br>
                              2) Suppose that S and __wrap_S are defined
                              in different object files, which lay in
                              archives. With switch "-wrap=S" (and if we
                              have a reference to S) we should not get
                              the content of the first object file at
                              all, but how can we avoid parsing it if
                              there is the record 'S' in the
                              SymbolTable?<br>
                              <br>
                              3) If we generate DSO and have an
                              unresolved undefined symbol S we have to
                              rename it and store as __wrap_S. It'll
                              require more complicated update of symbols
                              in the symbol table to support this case.<br>
                              <br>
                              Finally, my approach, I hope, preserves
                              the original idea of relationships in the
                              symbol table. With my case, we will have
                              links like these:<br>
                              (Body for undefined S, name is "__wrap_S")
                              -> (Symbol with name "__wrap_S") ->
                              (Body for defined symbol "__wrap_S")<br>
                              (Body for defined __wrap_S) -> (Symbol
                              with name "__wrap_S") -> (Body for
                              defined symbol "__wrap_S")<br>
                              (Body for defined S) -> (Symbol with
                              name "S") -> (Body for defined symbol
                              "S")<br>
                              <br>
                              With your proposal we'll end up with links
                              like:<br>
                              (Body for undefined S, name is "S") ->
                              (Symbol with name "S") -> (Body for
                              defined symbol "__wrap_S")<br>
                              (Body for defined __wrap_S) -> (Symbol
                              with name "__wrap_S") -> (Body for
                              defined symbol "__wrap_S")<br>
                              (Body for defined S) -> (Symbol with
                              name "S") -> (Body for defined symbol
                              "__wrap_S")<br>
                              <br>
                              The last line is a bit unexpected and can
                              be a source for unwitting bugs.<br>
                              <br>
                            </div>
                          </blockquote>
                        </div>
                        <br>
                      </div>
                    </blockquote>
                    <br>
                  </div>
                </div>
              </div>
            </blockquote>
          </div>
          <br>
        </div>
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br></div></div>