<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Oct 8, 2015 at 9:33 PM, Igor Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000"><span class="">
    <br>
    <br>
    <div>On 09.10.2015 4:08, Rui Ueyama wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">On Wed, Oct 7, 2015 at 10:55 PM, Igor
            Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF"> I've added some
                examples to show the differences between your patch and
                GNU ld's behavior, see <a href="http://reviews.llvm.org/D13528" target="_blank">http://reviews.llvm.org/D13528</a>.
                For me, it's not just the different way to implement the
                switch, but mostly change its meaning.</div>
            </blockquote>
            <div><br>
            </div>
            <div>It is indeed different from GNU, but I think our
              behavior makes sense and I don't think that's a bad
              difference. LLD's symbol handling is already pretty
              different in some area, particularly how we handle archive
              files and --start-group/--end-group. Just like that, this
              is something we don't want to aim the exact compatibility.</div>
          </div>
        </div>
      </div>
    </blockquote>
    <br></span>
    Any unnecessary difference may lead to unexpected problems for end
    users. According to this, I don't feel like accepting your approach,
    which, compared to ld's behavior, is different in almost all aspects
    I can imagine, whereas my original patch follows ld's and gold's
    conduct pretty well. Maybe it's better not implement this switch at
    all.<br></div></blockquote><div><br></div><div>I'd agree. Maybe we want some implementation of --wrap in future, but probably it's too early. We can do that later after getting better idea what end users would say.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
    Concerning the symbol table, the new mechanic of symbol handling is
    indeed different, but it implies the same rules of symbol resolution
    and seems like it can imitate GNU's behavior someday, even with
    --(start|end)-group.</div></blockquote><div><br></div><div>I doubt that we would want to imitate --start-group/end-group behavior. That seems a hack to work around Unix linker's keep-only-undefined-symbols semantics, and we don't need that hack at all by design. Since large programs such as Clang can be happily linked with the new LLD, I believe we don't have to imitate --start-group/end-group.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><div><div class="h5"><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF">
                <div>
                  <div><br>
                    <br>
                    <div>On 08.10.2015 02:12, Rui Ueyama wrote:<br>
                    </div>
                    <blockquote type="cite">
                      <div dir="ltr">
                        <div class="gmail_extra">
                          <div class="gmail_quote">On Wed, Oct 7, 2015
                            at 10:28 AM, Igor Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>
                            wrote:<br>
                            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                              <div bgcolor="#FFFFFF" text="#000000">
                                They have quite a good explanation of
                                the option's behavior and both ld and
                                gold produce the same result. The option
                                should only change the resolution target
                                of the specified symbols and shouldn't
                                do anything else; otherwise, the
                                generated result will be unpredictable.
                                In particular, we should neither change
                                the name of a defined symbol nor remove
                                it.<br>
                                <br>
                                OK, I can suggest a bit different
                                approach. What if we add a field to the
                                Symbol class to redirect undefined
                                symbol bodies to another Symbol? It'll
                                require a small update in resolve() and
                                maybe in a few other places, but the
                                additional hash map will be gone. What
                                do you think?</div>
                            </blockquote>
                            <div><br>
                            </div>
                            <div>We may be overthinking about that. I
                              don't think that the difference between
                              LLD and GNU ld is going to be a real issue
                              in real-world usage, and even it does, we
                              can do whatever to fix that after we
                              recognize the need. So I think something
                              like <a href="http://reviews.llvm.org/D13528" target="_blank">http://reviews.llvm.org/D13528</a>
                              suffices.</div>
                            <div> </div>
                            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                              <div bgcolor="#FFFFFF" text="#000000">
                                <div>
                                  <div> <br>
                                    <div>On 07.10.2015 22:38, Rui Ueyama
                                      wrote:<br>
                                    </div>
                                    <blockquote type="cite">
                                      <div dir="ltr">We are not
                                        necessarily aim 100% precise
                                        compatibility with GNU ld in
                                        every detail. GNU ld may have
                                        implemented this feature in a
                                        way that's natural to them. I'd
                                        really want to implement this
                                        feature in a way as the LLD
                                        architecture is designed for.
                                        This --wrap option is by nature
                                        a bit hacky option, and users
                                        need to understand what they are
                                        doing. GNU ld manual does not
                                        mention about the details about
                                        how this option is supposed to
                                        work.<br>
                                      </div>
                                      <div class="gmail_extra"><br>
                                        <div class="gmail_quote">On Wed,
                                          Oct 7, 2015 at 9:12 AM, Igor
                                          Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>
                                          wrote:<br>
                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                                            <div bgcolor="#FFFFFF" text="#000000">
                                              <div>
                                                <div> <br>
                                                  <br>
                                                  <div>On 07.10.2015
                                                    20:22, Rui Ueyama
                                                    wrote:<br>
                                                  </div>
                                                  <blockquote type="cite">
                                                    <div dir="ltr">
                                                      <div class="gmail_extra">
                                                        <div class="gmail_quote">On
                                                          Wed, Oct 7,
                                                          2015 at 6:59
                                                          AM, Igor
                                                          Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>
                                                          wrote:<br>
                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">ikudrin



                                                          added a
                                                          comment.<br>
                                                          <span><br>
                                                          In <a href="http://reviews.llvm.org/D13501#261600" target="_blank"></a><a href="http://reviews.llvm.org/D13501#261600" target="_blank">http://reviews.llvm.org/D13501#261600</a>,
                                                          @ruiu wrote:<br>
                                                          <br>
                                                          > This
                                                          patch needs
                                                          redesigning
                                                          because we
                                                          don't want to
                                                          look up hash
                                                          tables more
                                                          than once for
                                                          each symbol.
                                                          In this patch,
                                                          names for
                                                          undefined
                                                          symbols are
                                                          looked up
                                                          twice -- once
                                                          in the
                                                          InputFile.cpp
                                                          and the other
                                                          in
                                                          SymbolTable.cpp.<br>
                                                          <br>
                                                          <br>
                                                          </span>We have
                                                          to look up for
                                                          different
                                                          names for
                                                          defined and
                                                          undefined
                                                          symbols, so we
                                                          can't use just
                                                          one hash map.<br>
                                                          <br>
                                                          In most cases,
                                                          when the -wrap
                                                          switch is not
                                                          used and
                                                          UndefSymNameReplacement
                                                          is empty,
                                                          addition
                                                          lookup will be
                                                          very cheap,
                                                          without
                                                          calculating a
                                                          hash value at
                                                          all. On the
                                                          other hand, we
                                                          can use just
                                                          std::map which
                                                          expected to
                                                          work really
                                                          quick in our
                                                          case.<br>
                                                          <br>
                                                          </blockquote>
                                                          <div><br>
                                                          </div>
                                                          <div>No, we
                                                          don't have to
                                                          look up hash
                                                          table more
                                                          than once. I
                                                          thought a bit
                                                          more about
                                                          this and
                                                          noticed that
                                                          the LLD
                                                          architecture
                                                          (the
                                                          separation of
                                                          Symbol and
                                                          SymbolBody)
                                                          would really
                                                          play well
                                                          here. Symbols
                                                          are just
                                                          pointers to
                                                          SymbolBodies,
                                                          and symbol
                                                          renaming is as
                                                          easy as single
                                                          pointer
                                                          mutation.
                                                          Here's the
                                                          idea.</div>
                                                          <div><br>
                                                          </div>
                                                          <div>First we
                                                          resolve all
                                                          symbols
                                                          without
                                                          considering
                                                          --wrap option
                                                          (so no
                                                          overhead for
                                                          --wrap during
                                                          file parsing
                                                          and symbol
                                                          resolution).
                                                          When symbol
                                                          resolution is
                                                          done, we
                                                          basically have
                                                          three Symbols
                                                          for --wrap'ed
                                                          symbol S in
                                                          the symbol
                                                          table: S,
                                                          __wrap_S, and
                                                          __real_S. Let
                                                          X, Y and Z be
                                                          their
                                                          SymbolBodies,
                                                          respectively.</div>
                                                          <div><br>
                                                          </div>
                                                          <div>Originally
                                                          the
                                                          relationships
                                                          are</div>
                                                          <div><br>
                                                          </div>
                                                          <div>  S ->
                                                          X</div>
                                                          <div> 
                                                          __wrap_S ->
                                                          Y</div>
                                                          <div> 
                                                          __real_S ->
                                                          Z</div>
                                                          <div><br>
                                                          </div>
                                                          <div>We swap
                                                          the pointers
                                                          so that their
                                                          relationships
                                                          are</div>
                                                          <div><br>
                                                          </div>
                                                          <div>  S ->
                                                          Y</div>
                                                          <div> 
                                                          __wrap_S ->
                                                          Y</div>
                                                          <div> 
                                                          __real_S ->
                                                          X</div>
                                                          <div><br>
                                                          </div>
                                                          <div>Now you
                                                          can see that
                                                          all symbols
                                                          that would
                                                          have been
                                                          resolved to S
                                                          without --wrap
                                                          are now
                                                          resolved as if
                                                          they were
                                                          __wrap_S.
                                                          __real_S is
                                                          also properly
                                                          resolved to
                                                          the symbol
                                                          that was once
                                                          the real
                                                          symbol pointed
                                                          to. This is
                                                          the same
                                                          result as what
                                                          --wrap
                                                          expects.</div>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </blockquote>
                                                  <br>
                                                </div>
                                              </div>
                                              Thank you for explaining
                                              your idea. It can work in
                                              some cases, but I dread
                                              it's unable to cover all
                                              situations.<br>
                                              <br>
                                              1) Suppose we have all
                                              three symbols defined. The
                                              command like "ld.lld2 a.o
                                              -wrap=S" in your case will
                                              eliminate Z and change
                                              content for symbols S and
                                              __real_S (see
                                              SymbolTableSection<ELFT>::writeGlobalSymbols).
                                              That's not right, GNU's ld
                                              preserves symbols in this
                                              case.<br>
                                              <br>
                                              2) Suppose that S and
                                              __wrap_S are defined in
                                              different object files,
                                              which lay in archives.
                                              With switch "-wrap=S" (and
                                              if we have a reference to
                                              S) we should not get the
                                              content of the first
                                              object file at all, but
                                              how can we avoid parsing
                                              it if there is the record
                                              'S' in the SymbolTable?<br>
                                              <br>
                                              3) If we generate DSO and
                                              have an unresolved
                                              undefined symbol S we have
                                              to rename it and store as
                                              __wrap_S. It'll require
                                              more complicated update of
                                              symbols in the symbol
                                              table to support this
                                              case.<br>
                                              <br>
                                              Finally, my approach, I
                                              hope, preserves the
                                              original idea of
                                              relationships in the
                                              symbol table. With my
                                              case, we will have links
                                              like these:<br>
                                              (Body for undefined S,
                                              name is "__wrap_S") ->
                                              (Symbol with name
                                              "__wrap_S") -> (Body
                                              for defined symbol
                                              "__wrap_S")<br>
                                              (Body for defined
                                              __wrap_S) -> (Symbol
                                              with name "__wrap_S")
                                              -> (Body for defined
                                              symbol "__wrap_S")<br>
                                              (Body for defined S) ->
                                              (Symbol with name "S")
                                              -> (Body for defined
                                              symbol "S")<br>
                                              <br>
                                              With your proposal we'll
                                              end up with links like:<br>
                                              (Body for undefined S,
                                              name is "S") -> (Symbol
                                              with name "S") -> (Body
                                              for defined symbol
                                              "__wrap_S")<br>
                                              (Body for defined
                                              __wrap_S) -> (Symbol
                                              with name "__wrap_S")
                                              -> (Body for defined
                                              symbol "__wrap_S")<br>
                                              (Body for defined S) ->
                                              (Symbol with name "S")
                                              -> (Body for defined
                                              symbol "__wrap_S")<br>
                                              <br>
                                              The last line is a bit
                                              unexpected and can be a
                                              source for unwitting bugs.<br>
                                              <br>
                                            </div>
                                          </blockquote>
                                        </div>
                                        <br>
                                      </div>
                                    </blockquote>
                                    <br>
                                  </div>
                                </div>
                              </div>
                            </blockquote>
                          </div>
                          <br>
                        </div>
                      </div>
                    </blockquote>
                    <br>
                  </div>
                </div>
              </div>
            </blockquote>
          </div>
          <br>
        </div>
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br></div></div>