<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Reverted in 189386.  Once again, I apologize for not following the
    canonical procedure. <br>
    I personally think Nick's proposal is clean enough for our system,
    and I take it for granted <br>
    that the community will like it.<br>

    <br>

    I will not initiate a discussion for now. I'd like to let things
    cool down for a while (and maybe postpone it indefinitely). <br>

    <br>

    As with most infrastructure-related projects, partitioning is
    unglamorous and painstaking work. <br>
    I stepped forward to take it on only because we have almost no way
    to debug or investigate LTO. <br>

    <br>

    For those who are curious about how much partitioning can speed
    things up: unfortunately, I can't tell yet,<br>
    as the project is not completely done. My rudimentary (quite
    stupid, actually) <br>
    implementation using the make utility speeds up the command "clang++
    Xalancbmk/*.o -flto"<br>
    by 39% (35s vs. 21s; Xalancbmk has 700+ inputs).  That is a bit of a
    shame for partitioning. But at the very least, each partition <br>
    is under human control.  On the other hand, post-IPO
    scalar optimization is not yet parallelized<br>
    in my rudimentary implementation (i.e. so far only the code-gen
    part is parallelized). Surprisingly, <br>
    the result is very consistent with what Xiaofei achieved via
    multi-threaded code-gen.  As far <br>
    as I can recall, he got a speedup of some 2.9x. In my case, it takes
    about 13s before code-gen starts,<br>
    meaning the speedup of code-gen is about (35-13)/(21-13) =
    2.75x.  <br>
    (Code-gen plus the linker's post-processing takes 35-13 = 22s.)<br>

    <br>
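    The arithmetic above can be reproduced with a few lines of Python.
    This is only a sketch: the variable and function names are made up
    for illustration, and the inputs are the rounded timings quoted
    above, not fresh measurements.<br>
    <pre>
```python
# Rounded timings quoted in this message (seconds); names are illustrative.
serial_total = 35.0      # whole link without partitioning
parallel_total = 21.0    # whole link with make-driven partitions
before_codegen = 13.0    # time spent before code-gen starts (not parallelized)


def speedup(old, new):
    """Return how many times faster `new` is than `old`."""
    return old / new


# End-to-end speedup of the whole command.
overall = speedup(serial_total, parallel_total)

# Speedup of the parallelized part only (code-gen plus linker post-processing).
codegen = speedup(serial_total - before_codegen, parallel_total - before_codegen)

# Fraction of wall-clock time saved end to end.
time_saved = 1.0 - parallel_total / serial_total

print(f"overall: {overall:.2f}x, code-gen: {codegen:.2f}x, time saved: {time_saved:.0%}")
```
    </pre>
    With these rounded inputs the end-to-end saving comes out at
    roughly 40%, in line with the 39% figure, while the ratio for the
    parallelized part alone is the 2.75x given above.<br>
    <br>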

    <div class="moz-cite-prefix">On 8/27/13 12:27 AM, Shuxin Yang wrote:<br>

    </div>

    <blockquote cite="mid:521C54EB.8090803@gmail.com" type="cite">


      <div class="moz-cite-prefix">On 8/26/13 11:19 PM, Chandler Carruth

        wrote:<br>

      </div>

      <blockquote

cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"

        type="cite">

        <div dir="ltr">On Mon, Aug 26, 2013 at 5:53 PM, Shuxin Yang <span

            dir="ltr"><<a moz-do-not-send="true"

              href="mailto:shuxin.llvm@gmail.com" target="_blank"

              class="cremed">shuxin.llvm@gmail.com</a>></span> wrote:<br>

          <div class="gmail_extra">

            <div class="gmail_quote">

              <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">We

                certainly need a way to feed multiple resulting objects
                back to the linker.  There are a couple of ways<br>
                to this end:<br>
                <br>
                   1) "ld -r all-resulting-obj-on-disk -o result.o", and
                feed the single object file (i.e. result.o)<br>
                       back to the linker<br>
                <br>
                    2) keep the resulting objects in memory buffers, and
                feed the buffers back to the linker<br>
                        (as proposed by Nick)<br>
                <br>
                    3) As with GNU gold, save the resulting objects on
                disk, and feed these disk files back to the linker<br>
                one by one.<br>
                <br>
                    I'm no big linker nut. I don't know which way works
                better.  I tried to use 1) as a workaround for the time
                being<br>
                before 2) is available. People at Apple disagree with my
                engineering approach.<br>
                <br>
                    From the compiler's perspective,<br>
                    o. 1) is not just a workaround; 3) is certainly better
                than 1).<br>
                    o. 2) will win if the program being compiled is
                small- or medium-sized.<br>
                        With huge programs, it will be difficult for the
                compiler to decide when and how to "spill" some stuff<br>
                        from memory to disk.  Folks in Apple iterate and
                reiterate that we only consider the case where the entire<br>
                       program can be loaded in memory. So, the added
                difficulty for the compiler does not seem to be a<br>

                       problem for the workload we care about.</blockquote>

              <div><br>

              </div>

              <div>

                <div>Shuxin, I'm not sure what you're trying to

                  accomplish here, but I don't think this is the right

                  approach.</div>

                <div><br>

                </div>

                <div>First, you seem to be pursuing a partitioning

                  scheme for parallelizing LTO work despite *no*

                  consensus that this is the correct approach </div>

              </div>

            </div>

          </div>

        </div>

      </blockquote>

      I sent a proposal a long time ago; as far as I can tell from
      the mailing list, there was no objection at all. <br>
      Actually, my approach is not new at all. It is almost a "std"
      way to perform partitioning. It looks similar to all the LTO
      implementations I have worked or played with before.<br>
      It just needs some LLVM flavor.  But this change has nothing to do
      with the partition implementation; it just adds an interface. <br>

      <br>

      <blockquote

cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"

        type="cite">

        <div dir="ltr">

          <div class="gmail_extra">

            <div class="gmail_quote">

              <div>

                <div>in any of the community discussions I can find.

                  Please don't commit code toward a design that the

                  community has expressed serious reservations about

                  without review.</div>

                <div><br>

                </div>

                <div>Second, you are committing a new API to the set of

                  the stable C APIs that libLTO exposes without a

                  thorough discussion on the mailing list. </div>

              </div>

            </div>

          </div>

        </div>

      </blockquote>

      Sorry, I thought this was pretty much an Apple thing, as no other
      system uses this API. <br>
      I will revert tomorrow, and initiate a discussion. <br>
      <br>
      The APIs are almost divided into two classes: one for Unix + gold,
      the other for OSX + Apple LD.<br>
      I don't like the way it is, and I don't like such APIs at all
      (I mean all of them). <br>
      I used to argue that we are better off having a symbol-related
      interface instead of an LTO-related API.<br>
      But the community does not buy my point.  As I have little
      knowledge of LLVM, I have to keep <br>
      an open mind and adapt to LLVM thinking, but it certainly takes
      some time. <br>

      <br>

      <blockquote

cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"

        type="cite">

        <div dir="ltr">

          <div class="gmail_extra">

            <div class="gmail_quote">

              <div>

                <div>It is possible I have missed this discussion, but I

                  did look and failed to find anything that seems to

                  resemble a review, much less an LGTM. If I have missed

                  it, I apologize and please direct me at the thread. I

                  bring this up because the specific interface seems

                  surprising to me.</div>

                <div><br>

                </div>

                <div>Third, you are justifying the particular approach

                  with a deflection to some discussion within Apple or

                  with those developers you work with at Apple. While

                  this may in fact be the motivation for this patch, the

                  open source community is often not party to these

                  discussions. ;] </div>

              </div>

            </div>

          </div>

        </div>

      </blockquote>

      That is true:-)<br>

      <br>

      <blockquote

cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"

        type="cite">

        <div dir="ltr">

          <div class="gmail_extra">

            <div class="gmail_quote">

              <div>

                <div>It would help us if you would just give the

                  specific basis rather than referencing a discussion

                  that we weren't involved with. As it happens, I

                  suspect I agree with these "Folks in Apple" that it is

                  useful to specifically optimize for the case that an

                  entire program fits into memory, bypassing the

                  filesystem. </div>

              </div>

            </div>

          </div>

        </div>

      </blockquote>

      You bet! <br>
      <br>
      I debated with them. No chance to win. Why didn't you suspect so in
      the first place? :-) <br>
      But "folks in Apple" argue that it is the plan for the future.  It
      does not seem to be a lame argument, <br>
      as the current implementation of LTO brings everything into memory. <br>

      <br>

      | However, there are many paths to that end result. From the

      little information in the commit log there isn't really enough to

      tell why *this* is the necessary path forward (in fact, I'm

      somewhat confident it isn't).<br>

      <br>

      In concept, there is only one alternative: compile the merged
      module into multiple objects, and feed the objects back to the linker.

      <br>

        <br>

      <br>

      <blockquote

cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"

        type="cite">

        <div dir="ltr">

          <div class="gmail_extra">

            <div class="gmail_quote">

              <div> </div>

              <div><br>

              </div>

              <div><br>

              </div>

              <div>So, to get back to Eric's original question: what is

                the motivation for this API, its expected actual usage,

                and the reason why it is important to stub out in this

                way now? </div>

            </div>

          </div>

        </div>

      </blockquote>

      The motivation is: the existing LTO compiles the merged module into
      a *single* object; <br>
      with this new API, it becomes possible to compile the merged module
      into *multiple* objects. <br>
      I wonder whether this is clear now.  <br>
      <br>
      For instance, suppose the command line is "clang -flto a.o b.bc
      c.o d.bc" (*.o are real objects, and *.bc are bitcode). <br>
      The existing LTO will merge b.bc and d.bc into t.bc (the merged
      module), compile the merged t.bc into t.o, <br>
      and feed t.o back to the linker, which combines a.o, c.o and t.o
      into a.out. <br>
      <br>
      The new API will trigger the compiler to convert t.bc into p1.o,
      p2.o, ..., and feed these p*.o back to the linker, which <br>
      combines a.o, c.o and p*.o into a.out. <br>

       <br>

        <br>

      <br>
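      The difference between the two flows can be sketched as a small
      Python model. This is only an illustration: the function name
      linker_inputs and its partitions parameter are hypothetical, the
      file names t.o and p1.o, p2.o come from the example above, and
      nothing here calls the real libLTO interface.<br>
      <pre>
```python
def linker_inputs(files, partitions=1):
    """Return the object files the linker finally combines.

    Files ending in .bc are treated as bitcode and merged into one module
    (t.bc in the example); everything else is passed through as a real object.
    """
    real_objects = [f for f in files if not f.endswith(".bc")]
    bitcode = [f for f in files if f.endswith(".bc")]
    if not bitcode:
        return real_objects
    if partitions == 1:
        # Existing behaviour: the merged module becomes a single object.
        lto_objects = ["t.o"]
    else:
        # With the new interface: the merged module becomes several objects.
        lto_objects = [f"p{i}.o" for i in range(1, partitions + 1)]
    return real_objects + lto_objects


command_line = ["a.o", "b.bc", "c.o", "d.bc"]
print(linker_inputs(command_line))                # single-object flow
print(linker_inputs(command_line, partitions=2))  # multiple-object flow
```
      </pre>
      In both cases a.o and c.o are passed through untouched; only the
      number of objects produced from the merged module changes.<br>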


      <blockquote

cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"

        type="cite">

        <div dir="ltr">

          <div class="gmail_extra">

            <div class="gmail_quote">

              <div>Better yet, could we have that discussion before

                growing the set of stable APIs that we claim to never

                regress?</div>

            </div>

          </div>

        </div>

      </blockquote>

      <br>

      Sure. Sorry about that. I actually don't want to touch the
      lto_xxx() API for now.  I just want to work around <br>
      the limitation of the linker, and wait for the new ld. But Bob
      didn't buy my argument :-).<br>

      <br>

      <br>

    </blockquote>

    <br>

  </body>

</html>