<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Oct 8, 2015 at 9:33 PM, Igor Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div bgcolor="#FFFFFF" text="#000000"><span class="">

    <br>

    <br>

    <div>On 09.10.2015 4:08, Rui Ueyama wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">On Wed, Oct 7, 2015 at 10:55 PM, Igor

            Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div text="#000000" bgcolor="#FFFFFF"> I've added some

                examples to show the differences between your patch and

                GNU ld's behavior, see <a href="http://reviews.llvm.org/D13528" target="_blank">http://reviews.llvm.org/D13528</a>.

                For me, it's not just the different way to implement the

                switch, but mostly change its meaning.</div>

            </blockquote>

            <div><br>

            </div>

            <div>It is indeed different from GNU, but I think our

              behavior makes sense and I don't think that's a bad

              difference. LLD's symbol handling is already pretty

              different in some area, particularly how we handle archive

              files and --start-group/--end-group. Just like that, this

              is something we don't want to aim the exact compatibility.</div>

          </div>

        </div>

      </div>

    </blockquote>

    <br></span>

    Any unnecessary difference may lead to unexpected problems for end

    users. According to this, I don't feel like accepting your approach,

    which, compared to ld's behavior, is different in almost all aspects

    I can imagine, whereas my original patch follows ld's and gold's

    conduct pretty well. Maybe it's better not implement this switch at

    all.<br></div></blockquote><div><br></div><div>I'd agree. Maybe we want some implementation of --wrap in future, but probably it's too early. We can do that later after getting better idea what end users would say.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    Concerning the symbol table, the new mechanic of symbol handling is

    indeed different, but it implies the same rules of symbol resolution

    and seems like it can imitate GNU's behavior someday, even with

    --(start|end)-group.</div></blockquote><div><br></div><div>I doubt that we would want to imitate --start-group/end-group behavior. That seems a hack to work around Unix linker's keep-only-undefined-symbols semantics, and we don't need that hack at all by design. Since large programs such as Clang can be happily linked with the new LLD, I believe we don't have to imitate --start-group/end-group.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><div><div class="h5"><br>

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div text="#000000" bgcolor="#FFFFFF">

                <div>

                  <div><br>

                    <br>

                    <div>On 08.10.2015 02:12, Rui Ueyama wrote:<br>

                    </div>

                    <blockquote type="cite">

                      <div dir="ltr">

                        <div class="gmail_extra">

                          <div class="gmail_quote">On Wed, Oct 7, 2015

                            at 10:28 AM, Igor Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>

                            wrote:<br>

                            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                              <div bgcolor="#FFFFFF" text="#000000">

                                They have quite a good explanation of

                                the option's behavior and both ld and

                                gold produce the same result. The option

                                should only change the resolution target

                                of the specified symbols and shouldn't

                                do anything else; otherwise, the

                                generated result will be unpredictable.

                                In particular, we should neither change

                                the name of a defined symbol nor remove

                                it.<br>

                                <br>

                                OK, I can suggest a bit different

                                approach. What if we add a field to the

                                Symbol class to redirect undefined

                                symbol bodies to another Symbol? It'll

                                require a small update in resolve() and

                                maybe in a few other places, but the

                                additional hash map will be gone. What

                                do you think?</div>

                            </blockquote>

                            <div><br>

                            </div>

                            <div>We may be overthinking about that. I

                              don't think that the difference between

                              LLD and GNU ld is going to be a real issue

                              in real-world usage, and even it does, we

                              can do whatever to fix that after we

                              recognize the need. So I think something

                              like <a href="http://reviews.llvm.org/D13528" target="_blank">http://reviews.llvm.org/D13528</a>

                              suffices.</div>

                            <div> </div>

                            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                              <div bgcolor="#FFFFFF" text="#000000">

                                <div>

                                  <div> <br>

                                    <div>On 07.10.2015 22:38, Rui Ueyama

                                      wrote:<br>

                                    </div>

                                    <blockquote type="cite">

                                      <div dir="ltr">We are not

                                        necessarily aim 100% precise

                                        compatibility with GNU ld in

                                        every detail. GNU ld may have

                                        implemented this feature in a

                                        way that's natural to them. I'd

                                        really want to implement this

                                        feature in a way as the LLD

                                        architecture is designed for.

                                        This --wrap option is by nature

                                        a bit hacky option, and users

                                        need to understand what they are

                                        doing. GNU ld manual does not

                                        mention about the details about

                                        how this option is supposed to

                                        work.<br>

                                      </div>

                                      <div class="gmail_extra"><br>

                                        <div class="gmail_quote">On Wed,

                                          Oct 7, 2015 at 9:12 AM, Igor

                                          Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>

                                          wrote:<br>

                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                            <div bgcolor="#FFFFFF" text="#000000">

                                              <div>

                                                <div> <br>

                                                  <br>

                                                  <div>On 07.10.2015

                                                    20:22, Rui Ueyama

                                                    wrote:<br>

                                                  </div>

                                                  <blockquote type="cite">

                                                    <div dir="ltr">

                                                      <div class="gmail_extra">

                                                        <div class="gmail_quote">On

                                                          Wed, Oct 7,

                                                          2015 at 6:59

                                                          AM, Igor

                                                          Kudrin <span dir="ltr"><<a href="mailto:ikudrin.dev@gmail.com" target="_blank"></a><a href="mailto:ikudrin.dev@gmail.com" target="_blank">ikudrin.dev@gmail.com</a>></span>

                                                          wrote:<br>

                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">ikudrin

                                                          added a

                                                          comment.<br>

                                                          <span><br>

                                                          In <a href="http://reviews.llvm.org/D13501#261600" target="_blank"></a><a href="http://reviews.llvm.org/D13501#261600" target="_blank">http://reviews.llvm.org/D13501#261600</a>,

                                                          @ruiu wrote:<br>

                                                          <br>

                                                          > This

                                                          patch needs

                                                          redesigning

                                                          because we

                                                          don't want to

                                                          look up hash

                                                          tables more

                                                          than once for

                                                          each symbol.

                                                          In this patch,

                                                          names for

                                                          undefined

                                                          symbols are

                                                          looked up

                                                          twice -- once

                                                          in the

                                                          InputFile.cpp

                                                          and the other

                                                          in

                                                          SymbolTable.cpp.<br>

                                                          <br>

                                                          <br>

                                                          </span>We have

                                                          to look up for

                                                          different

                                                          names for

                                                          defined and

                                                          undefined

                                                          symbols, so we

                                                          can't use just

                                                          one hash map.<br>

                                                          <br>

                                                          In most cases,

                                                          when the -wrap

                                                          switch is not

                                                          used and

                                                          UndefSymNameReplacement

                                                          is empty,

                                                          addition

                                                          lookup will be

                                                          very cheap,

                                                          without

                                                          calculating a

                                                          hash value at

                                                          all. On the

                                                          other hand, we

                                                          can use just

                                                          std::map which

                                                          expected to

                                                          work really

                                                          quick in our

                                                          case.<br>

                                                          <br>

                                                          </blockquote>

                                                          <div><br>

                                                          </div>

                                                          <div>No, we

                                                          don't have to

                                                          look up hash

                                                          table more

                                                          than once. I

                                                          thought a bit

                                                          more about

                                                          this and

                                                          noticed that

                                                          the LLD

                                                          architecture

                                                          (the

                                                          separation of

                                                          Symbol and

                                                          SymbolBody)

                                                          would really

                                                          play well

                                                          here. Symbols

                                                          are just

                                                          pointers to

                                                          SymbolBodies,

                                                          and symbol

                                                          renaming is as

                                                          easy as single

                                                          pointer

                                                          mutation.

                                                          Here's the

                                                          idea.</div>

                                                          <div><br>

                                                          </div>

                                                          <div>First we

                                                          resolve all

                                                          symbols

                                                          without

                                                          considering

                                                          --wrap option

                                                          (so no

                                                          overhead for

                                                          --wrap during

                                                          file parsing

                                                          and symbol

                                                          resolution).

                                                          When symbol

                                                          resolution is

                                                          done, we

                                                          basically have

                                                          three Symbols

                                                          for --wrap'ed

                                                          symbol S in

                                                          the symbol

                                                          table: S,

                                                          __wrap_S, and

                                                          __real_S. Let

                                                          X, Y and Z be

                                                          their

                                                          SymbolBodies,

                                                          respectively.</div>

                                                          <div><br>

                                                          </div>

                                                          <div>Originally

                                                          the

                                                          relationships

                                                          are</div>

                                                          <div><br>

                                                          </div>

                                                          <div>  S ->

                                                          X</div>

                                                          <div> 

                                                          __wrap_S ->

                                                          Y</div>

                                                          <div> 

                                                          __real_S ->

                                                          Z</div>

                                                          <div><br>

                                                          </div>

                                                          <div>We swap

                                                          the pointers

                                                          so that their

                                                          relationships

                                                          are</div>

                                                          <div><br>

                                                          </div>

                                                          <div>  S ->

                                                          Y</div>

                                                          <div> 

                                                          __wrap_S ->

                                                          Y</div>

                                                          <div> 

                                                          __real_S ->

                                                          X</div>

                                                          <div><br>

                                                          </div>

                                                          <div>Now you

                                                          can see that

                                                          all symbols

                                                          that would

                                                          have been

                                                          resolved to S

                                                          without --wrap

                                                          are now

                                                          resolved as if

                                                          they were

                                                          __wrap_S.

                                                          __real_S is

                                                          also properly

                                                          resolved to

                                                          the symbol

                                                          that was once

                                                          the real

                                                          symbol pointed

                                                          to. This is

                                                          the same

                                                          result as what

                                                          --wrap

                                                          expects.</div>

                                                        </div>

                                                      </div>

                                                    </div>

                                                  </blockquote>

                                                  <br>

                                                </div>

                                              </div>

                                              Thank you for explaining

                                              your idea. It can work in

                                              some cases, but I dread

                                              it's unable to cover all

                                              situations.<br>

                                              <br>

                                              1) Suppose we have all

                                              three symbols defined. The

                                              command like "ld.lld2 a.o

                                              -wrap=S" in your case will

                                              eliminate Z and change

                                              content for symbols S and

                                              __real_S (see

                                              SymbolTableSection<ELFT>::writeGlobalSymbols).

                                              That's not right, GNU's ld

                                              preserves symbols in this

                                              case.<br>

                                              <br>

                                              2) Suppose that S and

                                              __wrap_S are defined in

                                              different object files,

                                              which lay in archives.

                                              With switch "-wrap=S" (and

                                              if we have a reference to

                                              S) we should not get the

                                              content of the first

                                              object file at all, but

                                              how can we avoid parsing

                                              it if there is the record

                                              'S' in the SymbolTable?<br>

                                              <br>

                                              3) If we generate DSO and

                                              have an unresolved

                                              undefined symbol S we have

                                              to rename it and store as

                                              __wrap_S. It'll require

                                              more complicated update of

                                              symbols in the symbol

                                              table to support this

                                              case.<br>

                                              <br>

                                              Finally, my approach, I

                                              hope, preserves the

                                              original idea of

                                              relationships in the

                                              symbol table. With my

                                              case, we will have links

                                              like these:<br>

                                              (Body for undefined S,

                                              name is "__wrap_S") ->

                                              (Symbol with name

                                              "__wrap_S") -> (Body

                                              for defined symbol

                                              "__wrap_S")<br>

                                              (Body for defined

                                              __wrap_S) -> (Symbol

                                              with name "__wrap_S")

                                              -> (Body for defined

                                              symbol "__wrap_S")<br>

                                              (Body for defined S) ->

                                              (Symbol with name "S")

                                              -> (Body for defined

                                              symbol "S")<br>

                                              <br>

                                              With your proposal we'll

                                              end up with links like:<br>

                                              (Body for undefined S,

                                              name is "S") -> (Symbol

                                              with name "S") -> (Body

                                              for defined symbol

                                              "__wrap_S")<br>

                                              (Body for defined

                                              __wrap_S) -> (Symbol

                                              with name "__wrap_S")

                                              -> (Body for defined

                                              symbol "__wrap_S")<br>

                                              (Body for defined S) ->

                                              (Symbol with name "S")

                                              -> (Body for defined

                                              symbol "__wrap_S")<br>

                                              <br>

                                              The last line is a bit

                                              unexpected and can be a

                                              source for unwitting bugs.<br>

                                              <br>

                                            </div>

                                          </blockquote>

                                        </div>

                                        <br>

                                      </div>

                                    </blockquote>

                                    <br>

                                  </div>

                                </div>

                              </div>

                            </blockquote>

                          </div>

                          <br>

                        </div>

                      </div>

                    </blockquote>

                    <br>

                  </div>

                </div>

              </div>

            </blockquote>

          </div>

          <br>

        </div>

      </div>

    </blockquote>

    <br>

  </div></div></div>

</blockquote></div><br></div></div>