<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Reverted in 189386. Once again, I apologize for not following the
canonical procedure. <br>
I personally thought Nick's proposal was clean enough for our system,
and took for granted <br>
that the community would like it.<br>
<br>
I will not initiate a discussion for now. I'd like to let things cool
down for a while (and maybe postpone it indefinitely). <br>
<br>
As with most infrastructure-related projects, partitioning is
unglamorous and painstaking work. <br>
I stepped forward to take it on only because we currently have almost no way
to debug or investigate LTO. <br>
<br>
For those who are curious about how much speedup partitioning can give:
unfortunately, I can't tell yet,<br>
as the project is not completely done. My rudimentary (quite
stupid, actually) <br>
implementation, which uses the make utility, speeds up the command "clang++
Xalancbmk/*.o -flto"<br>
by 39% (35s vs 21s; Xalancbmk has 700+ inputs). That is a bit of a shame for
partitioning, but at the very least each partition <br>
is under human control. On the other hand, post-IPO
scalar optimization is not yet parallelized<br>
in my rudimentary implementation (i.e. so far only the
code-gen part is parallelized). Surprisingly, <br>
the result is very consistent with what Xiaofei achieved via
multi-threaded code-gen. As far <br>
as I can recall, he got a speedup of some 2.9x. In my case, it takes about 13s
before code-gen starts,<br>
meaning the speedup of code-gen itself is about (35-13)/(21-13) =
2.75x. <br>
(Code-gen plus the linker's post-processing takes 35-13 = 22s in the serial case.)<br>
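<br>
As a minimal sketch of the arithmetic above: the timings are the ones quoted in this message, while the variable and function names are made up purely for illustration.<br>
<pre>
```python
# Timings quoted in the message above, in seconds.
serial_total = 35.0     # whole link, code-gen done as a single job
parallel_total = 21.0   # whole link, code-gen split across partitions
before_codegen = 13.0   # time spent before code-gen starts (not parallelized)


def speedup(old, new):
    """Return how many times faster `new` is compared to `old`."""
    return old / new


# Speedup of the whole command, including the serial 13s prefix.
overall = speedup(serial_total, parallel_total)

# Speedup of the parallelized part only: subtract the serial prefix from both.
codegen_only = speedup(serial_total - before_codegen,
                       parallel_total - before_codegen)

print(f"overall speedup:  {overall:.2f}x")       # prints 1.67x
print(f"code-gen speedup: {codegen_only:.2f}x")  # prints 2.75x
```
</pre>
The serial 13s prefix is what keeps the overall figure well below the code-gen-only figure.<br>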
<br>
<div class="moz-cite-prefix">On 8/27/13 12:27 AM, Shuxin Yang wrote:<br>
</div>
<blockquote cite="mid:521C54EB.8090803@gmail.com" type="cite">
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
<div class="moz-cite-prefix">On 8/26/13 11:19 PM, Chandler Carruth
wrote:<br>
</div>
<blockquote
cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"
type="cite">
<div dir="ltr">On Mon, Aug 26, 2013 at 5:53 PM, Shuxin Yang <span
dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:shuxin.llvm@gmail.com" target="_blank"
class="cremed">shuxin.llvm@gmail.com</a>&gt;</span> wrote:<br>
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">We
certainly need a way to feed multiple resulting objects
back to the linker. There are a couple of ways<br>
to this end:<br>
<br>
1) "ld -r all-resulting-obj-on-disk -o result.o", and
feed the single object file (i.e. result.o)<br>
back to the linker<br>
<br>
2) keep the resulting objects in memory buffers, and
feed the buffers back to the linker<br>
(as proposed by Nick)<br>
<br>
3) As with GNU gold, save the resulting objects on
disk, and feed these disk files back to the linker<br>
one by one.<br>
<br>
I'm a big linker nut. I don't know which way works
better. I tried to use 1) as a workaround for the time
being,<br>
until 2) is available. People at Apple disagree with my
engineering approach.<br>
<br>
From the compiler's perspective,<br>
o. 1) is not just a workaround; 3) is certainly better
than 1).<br>
o. 2) will win if the program being compiled is
small- or medium-sized.<br>
With huge programs, it will be difficult for the
compiler to decide when and how to "spill" some stuff<br>
from memory to disk. Folks at Apple iterate and
reiterate that we only consider the case where the entire<br>
program can be loaded in memory. So, the added
difficulty for the compiler does not seem to be a<br>
problem for the workloads we care about.</blockquote>
<div><br>
</div>
<div>
<div>Shuxin, I'm not sure what you're trying to
accomplish here, but I don't think this is the right
approach.</div>
<div><br>
</div>
<div>First, you seem to be pursuing a partitioning
scheme for parallelizing LTO work despite *no*
consensus that this is the correct approach </div>
</div>
</div>
</div>
</div>
</blockquote>
I sent a proposal a long time ago and, as far as I can tell from
the mailing list, there was no objection at all. <br>
Actually, my approach is not new at all. It is almost a "standard"
way to perform partitioning, and it looks similar to all the LTOs I
have worked or played with before.<br>
It just needs some LLVM flavor. But this change has nothing to do with
the partition implementation; it just adds an interface. <br>
<br>
<blockquote
cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>
<div>in any of the community discussions I can find.
Please don't commit code toward a design that the
community has expressed serious reservations about
without review.</div>
<div><br>
</div>
<div>Second, you are committing a new API to the set of
the stable C APIs that libLTO exposes without a
thorough discussion on the mailing list. </div>
</div>
</div>
</div>
</div>
</blockquote>
Sorry, I thought this was pretty much an Apple thing, as no other system
uses this API. <br>
I will revert tomorrow, and initiate a discussion. <br>
<br>
The APIs are more or less divided into two classes: one for Unix + gold,
the other for OSX + Apple LD.<br>
I don't like the way it is, and I don't like such APIs at all
(I mean all of them). <br>
I used to argue that we would be better off with a symbol-related
interface instead of an LTO-related API,<br>
but the community does not buy my point. As I have little
knowledge of LLVM, I have to keep <br>
an open mind and adapt to LLVM thinking, but that certainly takes
some time. <br>
<br>
<blockquote
cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>
<div>It is possible I have missed this discussion, but I
did look and failed to find anything that seems to
resemble a review, much less an LGTM. If I have missed
it, I apologize and please direct me at the thread. I
bring this up because the specific interface seems
surprising to me.</div>
<div><br>
</div>
<div>Third, you are justifying the particular approach
with a deflection to some discussion within Apple or
with those developers you work with at Apple. While
this may in fact be the motivation for this patch, the
open source community is often not party to these
discussions. ;] </div>
</div>
</div>
</div>
</div>
</blockquote>
That is true:-)<br>
<br>
<blockquote
cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>
<div>It would help us if you would just give the
specific basis rather than referencing a discussion
that we weren't involved with. As it happens, I
suspect I agree with these "Folks in Apple" that it is
useful to specifically optimize for the case that an
entire program fits into memory, bypassing the
filesystem. </div>
</div>
</div>
</div>
</div>
</blockquote>
You bet! <br>
<br>
I debated with them. No chance to win. Why didn't you suspect so in the
first place :-). <br>
But the "folks at Apple" argue that that is a plan for the future. It does
not seem to be a pretty lame argument, <br>
as the current implementation of LTO brings everything into memory. <br>
<br>
| However, there are many paths to that end result. From the
little information in the commit log there isn't really enough to
tell why *this* is the necessary path forward (in fact, I'm
somewhat confident it isn't).<br>
<br>
Conceptually, there is only one alternative: compile the merged
module into multiple objects, and feed the objects back to the linker.
<br>
<br>
<br>
<blockquote
cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<div><br>
</div>
<div><br>
</div>
<div>So, to get back to Eric's original question: what is
the motivation for this API, its expected actual usage,
and the reason why it is important to stub out in this
way now? </div>
</div>
</div>
</div>
</blockquote>
The motivation is this: the existing LTO compiles the merged module into a
*single* object; <br>
this new API makes it possible to compile the merged module
into *multiple* objects. <br>
I'm wondering if this is clear now. <br>
<br>
For instance, suppose the command line is "clang -flto a.o b.bc
c.o d.bc" (*.o are real objects, and *.bc are bitcode). <br>
The existing LTO will merge b.bc and d.bc into t.bc (the merged module),
compile the merged t.bc into t.o, <br>
and feed t.o back to the linker, which combines a.o, c.o and t.o into
a.out. <br>
<br>
The new API will make the compiler produce p1.o, p2.o, ... instead of a
single t.o, and feed these p*.o back to the linker, which <br>
combines them with a.o and c.o into a.out. <br>
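<br>
The two flows can be sketched as a toy model. This is only an illustration of which files reach the final link. It does not call libLTO or any real linker, and the function names and the num_partitions parameter are hypothetical.<br>
<pre>
```python
def is_bitcode(name):
    """Treat *.bc inputs as bitcode and everything else as a real object."""
    return name.endswith(".bc")


def objects_for_final_link(inputs, num_partitions=1):
    """Toy model: list the object files handed to the final link step.

    All bitcode inputs are assumed to be merged into one module (t.bc).
    With num_partitions == 1 the merged module becomes a single t.o;
    otherwise it becomes p1.o, p2.o, ... (one object per partition).
    """
    native = [f for f in inputs if not is_bitcode(f)]
    bitcode = [f for f in inputs if is_bitcode(f)]
    if not bitcode:
        return native  # nothing for LTO to do
    if num_partitions == 1:
        generated = ["t.o"]
    else:
        generated = [f"p{i}.o" for i in range(1, num_partitions + 1)]
    return native + generated


inputs = ["a.o", "b.bc", "c.o", "d.bc"]
print(objects_for_final_link(inputs))                    # existing LTO
print(objects_for_final_link(inputs, num_partitions=2))  # with partitioning
```
</pre>
In both cases a.o and c.o pass through untouched. Only the number of objects generated from the merged module changes.<br>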
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<blockquote
cite="mid:CAGCO0Khk4+Eoy2pmW5yWUKA4KO3moX9wyvBLGOVHSosoUdD5Nw@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Better yet, could we have that discussion before
growing the set of stable APIs that we claim to never
regress?</div>
</div>
</div>
</div>
</blockquote>
<br>
Sure. Sorry about that. I actually didn't want to touch the
lto_xxx() API for now. I just wanted to work around <br>
the limitation of the linker, and wait for the new ld. But Bob
didn't buy my argument :-).<br>
<br>
<br>
</blockquote>
<br>
</body>
</html>