[llvm] r188188 - Misc enhancements to LTO:

Tue Aug 13 17:32:13 PDT 2013

On 12 August 2013 16:41, Shuxin Yang <shuxin.llvm at gmail.com> wrote:

>  Thank you very much for sharing your concerns.  I read this mail
> carefully, and it seems we had a little miscommunication.
> I hope I clarify things in this mail :-). See the interleaved text.
>
>
> On 8/12/13 3:41 PM, Nick Lewycky wrote:
>
>
>
>>
>>
>>   >which in turn drives libLTO through the API.
>>>
>>>  It depends on what kind of info "something else" needs in order to
>>> drive libLTO.
>>> In general it is a very bad idea if "something else" needs to
>>> micro-manage it.
>>>
>>
>>  libLTO is part of the linker that uses it.
>>
>>
>>  No! Absolutely not!
>>
>
>  Fair enough. I meant "libLTO is part of the linker that uses it" in the
> same sense that a networking library is part of the web browser that uses
> it. The library shouldn't be off deciding to do things of its own accord,
> it should provide an API that allows something else to accomplish its task.
>
> I don't think that comparison is precise.  For an app with a networking
> lib, the "brain" resides on the app side because the app defines the
> behavior. For linker+libLTO, I believe the "brain" should reside on the
> libLTO side, as it is far more complicated (although our current
> implementation is a bit simple).
>

I'm afraid I don't agree. We can agree to disagree on this point; it isn't
necessary for us to agree here to make forward progress.

From my point of view, the program that the user calls is ld. That ld is
responsible for fulfilling its promises to the user, who does not know or
care what libraries ld is using under the hood. We've taken the direction
that we'll add new features to libLTO lazily, when there's a demand from a
linker that wants them, but that shouldn't be confused for deliberately
hiding functionality from the linker.

And it is confusing because there *are* things that libLTO is deliberately
hiding: all the various changes in LLVM's C++ API.

> I believe GNU gold is a good example of designing the interface.  In the
> case of gold+libLTO, the "brain" is embodied by "tool/gold/*.cpp."
>

There's very little logic in tools/gold. It's either in llvm proper, or in
gold proper. Both libLTO and LLVMgold are thin wrappers.

> Anybody can change it to whatever he/she likes. Although it does call APIs,
> it does not have to. I can directly call the C++ code
> dancing behind the API.  This is the "stable API" I'm talking about.
>

Okay. As a matter of terminology, I've been using "stable" to mean
"unchanging", synonymous with ABI locked or ABI fixed.

>  I don't see this as a very bad idea or as micro-management.
>
>
> I didn't see that either until I started implementing stuff.
>
>   In this regard, I don't see a difference between libLTO and other
> libraries like libPNG or netlib or freetype. (There is a difference in that
> we want libLTO to be a very high-level interface instead of exposing the
> details of .bc files, entirely unlike what libPNG does for PNG or freetype
> does for font files.)
>
>
> Similar in concept. Concept only.
>
>       Having a default setting with the ability to override it is a
>> sensible convenience for users of libLTO.
>>
>>
>>
>>
>>  Take Apple ld as an example: suppose I want to change LTO so that it
>>> does not load every module,
>>> but only loads summary info. The current APIs are not sufficient. I
>>> have to modify the API, or add new APIs
>>> for that matter, and in the meantime I need to release the new ld to users
>>> in order to accommodate the change.
>>> That is a nightmare.
>>>
>>
>>  The point of libLTO is to provide an ABI-fixed library, isolating the
>> linker from llvm's internals.
>>
>>  It is not "fixed", it is changing constantly.
>>
>
>  The only reason libLTO exists at all is to give the linker something to
> link against which will have a fixed ABI. Same with "libclang" on the clang
> side.
>
> It is slightly different in the case of libLTO+linker.
>
> clang + libclang are tightly bound in terms of release. If you are not
> happy with particular API, you can change to whatever you feel more
> comfortable.
>
> The linker is usually released independently. The compiler has less
> control if the API between compiler and linker is constantly changing.
> Keeping backward compatibility is a big problem.
>

Right. libLTO is supposed to be that solution, by giving the linker a
stable ABI (and API) to link against, and handling whatever changes in the
LLVM C++ API behind the scenes. Put another way, a linker built against
libLTO version X should work with version Y for all Y>X forever, without
even a recompile (assuming the linker was built to load libLTO dynamically).

>  Thus far it's LLVM policy that the *whole and entire* C API is ABI-fixed
> forever, and I've argued a few times on the mailing list that this can't be
> right, and that only libLTO and libClang ought to be ABI locked.
>
>   E.g. the APIs used to take for granted that libLTO returns only one
>> object;
>> now I need to return multiple.
>>
>
>  Yes, and that's a problem. Not your problem really, except to the degree
> that you inherited it. The existing APIs in libLTO weren't nearly
> forwards-compatible enough, and now we're in trouble.
>
>  Unless we figure out something clever, we may have to add a whole new
> set of functions to libLTO, and not deprecate the existing ones (at least,
> not unless we get consensus on llvm-dev that it's okay to break our
> previous ABI promise).
>
>>    That in turn leads to a few design decisions. The API is designed to
>> refer to high-level concepts instead of the details of llvm's actual
>> behaviour. Things like module lazy loading or setting the datalayout are
>> excluded from the API. Flags are even more private, surely we should be
>> able to change flags in LLVM's libraries without worrying about breaking
>> linkers.
>>
>>  If the linker needs to do something where it matters how llvm is
>> implemented -- you mention loading summary info, I'll assume you mean
>> lazy-loading the module such that function bodies aren't loaded -- then the
>> linker doesn't use libLTO at all, but uses llvm directly. Conversely,
>> libLTO knows all about llvm and will lazy-load .bc files without being
>> asked to.
>>
>>  Sure, "something else" can control libLTO if it wants. In my case,
>>> if "something else" wants to specify
>>>  a workdir, then go ahead. Otherwise, libLTO uses the default one. Is
>>> there anything wrong here?
>>>
>>
>>  At a high level that sounds fine to me. The wrong part is using flags
>> to do it.
>>
>>  Then how do we change the behavior for, say, debugging purposes?
>>
>
>  Debugging is special. In theory, you don't even need to commit to
> upstream for debugging, but it's fine to add features that are helpful. We
> have that sort of thing all over llvm. libLTO has addDebugOptions to permit
> this sort of debugging usage, but it shouldn't be used in the non-debugging
> case.
>
>
> Passing flags to LTO is annoying and it is a sort of high-tech.
> Bill's attribute-stuff is a way to pass some flags down the road.
>
> How about passing -O3 -floop-vectorize to make LTO and post-LTO code
> work? (The opt-level is -O2, and the vectorize flag is off by default.)
>

Right, this is a great example. I would argue that we absolutely should not
offer such control to libLTO, not by flags or environment variables or C
API or anything. Why? Because it locks the entirety of LLVM to offering a
"loop vectorizer" forever. What happens in LLVM 4.0 when we have a "global
vectorizer" or "common vectorizer" or some other name? We've removed
optimizations before (global common subexpression elimination, gvnpre, ...)
and we should have the freedom to continue to do so.

Now, suppose along comes a linker vendor who says "as a feature, my users
can specify -floop-vectorize or other flags to control which optimizations
get run at link time". Do we refuse? Do we tell them it's okay, but we
reserve the right to break these flags at any time (and then what, we can't
catch typos because they could just be names of old optz'ns we don't
support)? But I actually don't think giving control of this is a *feature*
-- it can only really be used as a workaround for bugs (or exotic stuff like
kernel code where the vector register unit hasn't been initialized yet, but
I am okay with having a flag to control whether we're allowed to use a
certain class of instructions -- yet not okay with disabling individual
optimizations).

> We may (likely) have a better way in the future to pass these flags to LTO,
> but we have to pass these flags to it to make the existing code work,
> at least for the time being.
>
>
>          Adding flags to the linker instead is, I think, the wrong direction.
>>>> The linker does not have the data structures that libLTO does.
>>>
>>>
>>>  This is the discussion to have. What things do you need here which you
>>> don't think should be exposed through the API, and yet you want to be
>>> exposed for you?
>>>
>>>  I actually discussed this with Nick @ Apple before.  The conclusion is
>>> that the linker must be LTO oblivious;
>>> it should think in terms of symbols, and talk in terms of symbols (as
>>> with GNU gold). It would otherwise be
>>>  very very troublesome both for the linker and libLTO.
>>>
>>
>>  And now you're discussing it with me. I also agree that the linker
>> should communicate primarily in symbols and about symbols with libLTO.
>>
>>  On the other hand, we now have two linkers that support LTO. There are
>>> different ways to control
>>> libLTO (even for simple tasks, like saving intermediate files); how
>>> messy is that?
>>>
>>> I'd like to move all this stuff into libLTO to have a unified control.
>>>
>>
>>  I have no problem with a unified control.
>>
>>>     libLTO is intended to be used as a library, it may not get a chance
>>>> to parse flags.
>>>>  It has to. Prior to my change, linkers (GNU linker and Apple ld) pass
>>>> arch to linker, via a function
>>>> confusingly called, something like "add.*debug.*options".
>>>
>>>
>>>  Can't. If we allow this, every flag in every part of LLVM that libLTO
>>> links against is baked into the C ABI forever.
>>>
>>>  Of course addDebugOptions does allow this, but it's named (and I
>>> thought documented in the comments) such that anybody using it knows
>>> they're using a non-stable non-production debugging API. Anybody using
>>> addDebugOptions for something other than debugging libLTO is living outside
>>> the ABI guarantees.
>>>
>>>  addDebugOptions is a misnomer. It also passes essential flags like
>>> -arch=x86.  Without such flags,
>>> LTO does not even compile.
>>>
>>
>>  That sounds like a nice bug you've got there! Wouldn't want anything to
>> happen to it. It'd be a shame if it breaks before you manage to add a
>> liblto_set_arch() function for it.
>>
>>
>>   * Honestly, I looked and couldn't find a -arch flag that libLTO would
>> interpret. How sure are you about this?
>>
>>  Perhaps not -arch flags.
>> But at least some flags are passed this way.  I remember we used this way
>> to pass -fast-math before Bill's attribute-stuff was working.
>>
>>
>>   In case it isn't completely clear, flags are absolutely right out.
>> Either you will revert this patch, or I will revert it for you.
>>
>>  I have no alternative.  If I introduce a workdir, I need to have a way
>> to inform the linker plugin to get rid of it.
>> This is another example of why those APIs suck.
>>
>
>  You don't have the source code to the linker?
>
> I can modify the linker source code. The problem is how to make sure all
> users get the modified linker to work with the new compilers.
> It is going to be very messy, right?
>

True. You have a deployment problem where instead of shipping just a new
libLTO, you ship a new libLTO (and all older linkers must continue to work
with it), and then ship a new linker taking advantage of the new libLTO
APIs. Sorry, but I think this is a natural consequence of the fact that
libLTO needs to be ABI-locked.

(Also, in reality, if you can solve the deployment problem for libLTO, then
you can solve the deployment problem for libLTO+ld. Yes it'll be more work.)
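
As a rough illustration of that constraint, here is a minimal C sketch in
which an old single-object entry point keeps its signature and is
reimplemented on top of a newer multi-object one. Every name in it
(demo_codegen, demo_codegen_compile_parts, demo_codegen_get_part,
demo_codegen_compile) is hypothetical and not part of libLTO, and the object
names are placeholders for real codegen output.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_PARTS 4

/* Hypothetical code generator state; not a libLTO type. */
typedef struct {
    const char *objects[DEMO_MAX_PARTS]; /* placeholder "object files" */
    size_t nobjects;
} demo_codegen;

/* Newer entry point (hypothetical): produce between 1 and max_parts
   objects and return how many were produced, or 0 on bad arguments. */
size_t demo_codegen_compile_parts(demo_codegen *cg, size_t max_parts) {
    static const char *const names[DEMO_MAX_PARTS] = {
        "t1.o", "t2.o", "t3.o", "t4.o"
    };
    if (cg == NULL || max_parts == 0) return 0;
    if (max_parts > DEMO_MAX_PARTS) max_parts = DEMO_MAX_PARTS;
    for (size_t i = 0; i < max_parts; ++i)
        cg->objects[i] = names[i];
    cg->nobjects = max_parts;
    return cg->nobjects;
}

/* Newer accessor (hypothetical): NULL when i is out of range. */
const char *demo_codegen_get_part(const demo_codegen *cg, size_t i) {
    return (cg != NULL && i < cg->nobjects) ? cg->objects[i] : NULL;
}

/* Original entry point, signature unchanged. An old linker that only
   knows this call still receives exactly one object, because it is
   implemented as "compile with a single partition". */
const char *demo_codegen_compile(demo_codegen *cg) {
    if (demo_codegen_compile_parts(cg, 1) != 1) return NULL;
    return demo_codegen_get_part(cg, 0);
}
```

An old linker keeps calling demo_codegen_compile and never sees more than one
object; a new linker calls demo_codegen_compile_parts and walks the results
with demo_codegen_get_part. Nothing is removed, which is the price of the
ABI lock.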

> This is unlike clang and clanglib: they are so "close" in terms of release
> that we can change them at will.
>
>
>  Let's focus on this, it sounds like this is the key problem. What's
> wrong with modifying the linker if you want to change the behaviour of your
> linker?
>
> How often does a user check whether the linker is up to date?
>
>
>
>       I'm sorry you decided to land three things together in one patch,
>> please remember not to do that in the future.
>>
>>    Ok, tell me how to create a temp working directory the right way, and
>> how to save temp files the right way for both gold and Apple ld.
>>
>
>  *Why*? Are you implementing this as a linker feature you intend to ship
> in the real linker? Or is this to debug the innards of libLTO?
>
>
> It is not a linker feature; it is absolutely libLTO's own business.
> Creating a workdir is a neat way to organize intermediate files;
> we can certainly use a messier way to organize the intermediate files
> without creating a workdir.
>
>
>  The only case I *am* okay with flags is when we all agree they're flags
> for debugging the internals of libLTO,
>
> To a large extent, it is for troubleshooting purposes.
>
>   and that we don't ship products that rely on them.
>
> The product will not rely on it.
>

Okay. Got it. So I have a few thoughts on this.

First of all, why don't we expose the fact that we produce native .o files?
Because we don't want to necessarily require files. lto_codegen_compile
returns a pointer to memory holding the contents of the object file.
Hilariously this means that libLTO writes out to disk, loads it into memory, hands it
off to LLVMgold which writes it back down to disk again and hands it off to
gold, which reads it back into memory. And yet it's still the right
interface. libLTO should stop writing to disk and actually produce .o in
RAM directly, and gold should learn to read from RAM directly.*

Secondly, if we aren't exposing the fact that we produce native .o files,
should we be exposing a knob that lets us control the working directory?
Probably not, but it's not unreasonable. If we're going to write to disk
it's polite to let the caller choose where. If we don't write to disk, the
API can be trivially implemented by doing nothing.

You suggest $PWD/unique-tmp-workdir instead of /tmp. Consider $TMP, or on
Windows $TEMP? I don't mind if we're smart enough to pick good defaults,
but I can absolutely imagine a linker that wants to keep its temp files in a
specific directory. Suppose a mobile OS that runs the linker on the phone,
where they have strict disk quotas. It's important to put the files in the
right places, so they get counted against the right quotas. Also, for
cleanup in the event of a crash (assume it wipes the whole directory tree).
I really think lto_set_tempdir would be a good API to have in libLTO, and
poses no risk of being unimplementable in the future.
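
To make the proposal concrete, here is a minimal sketch of how such a knob
could resolve the directory: an explicit choice from the linker wins, then
the usual environment variables, then /tmp. The names demo_set_tempdir and
demo_get_tempdir are hypothetical stand-ins (lto_set_tempdir is only a
suggestion in this mail, not an existing libLTO function), and the lookup
order TMPDIR, TMP, TEMP is my assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical state; a real library would more likely keep this per
   code generator than in a global. Empty string means "no override". */
static char demo_tempdir[1024];

/* Hypothetical analogue of the proposed lto_set_tempdir.
   Returns 0 on success, -1 if the path is NULL, empty, or too long. */
int demo_set_tempdir(const char *path) {
    if (path == NULL || path[0] == '\0') return -1;
    if (strlen(path) >= sizeof demo_tempdir) return -1;
    strcpy(demo_tempdir, path);
    return 0;
}

/* Resolve the directory: caller override first, then the environment,
   then /tmp as the last resort. */
const char *demo_get_tempdir(void) {
    static const char *const vars[] = { "TMPDIR", "TMP", "TEMP" };
    if (demo_tempdir[0] != '\0') return demo_tempdir;
    for (size_t i = 0; i < sizeof vars / sizeof vars[0]; ++i) {
        const char *v = getenv(vars[i]);
        if (v != NULL && v[0] != '\0') return v;
    }
    return "/tmp";
}
```

If libLTO ever stops writing to disk, demo_set_tempdir can simply record the
path and never use it, which is why this kind of API stays implementable.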

Thirdly, I'm not convinced that lto_codegen_get_files_need_remove needs to
exist. Why not do the file deletion in lto_codegen_dispose?
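
A minimal sketch of what I mean, with the code generator remembering what it
created and cleaning up when it is disposed. The type and functions
(demo_codegen, demo_codegen_create, demo_codegen_emit_temp,
demo_codegen_dispose) are hypothetical, and it assumes the linker only calls
dispose once it has finished reading the objects.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_TEMPS 16

/* Hypothetical code generator that records the temp files it creates. */
typedef struct {
    char *temps[DEMO_MAX_TEMPS];
    size_t ntemps;
} demo_codegen;

demo_codegen *demo_codegen_create(void) {
    return calloc(1, sizeof(demo_codegen));
}

/* Create an empty file at path and record it; returns 0 on success. */
int demo_codegen_emit_temp(demo_codegen *cg, const char *path) {
    assert(cg != NULL && path != NULL);
    if (cg->ntemps == DEMO_MAX_TEMPS) return -1;
    char *copy = malloc(strlen(path) + 1);
    if (copy == NULL) return -1;
    strcpy(copy, path);
    FILE *f = fopen(path, "wb");
    if (f == NULL) { free(copy); return -1; }
    fclose(f);
    cg->temps[cg->ntemps++] = copy;
    return 0;
}

/* Dispose removes every recorded file, so the caller never has to ask
   for a list of files to delete. */
void demo_codegen_dispose(demo_codegen *cg) {
    if (cg == NULL) return;
    for (size_t i = 0; i < cg->ntemps; ++i) {
        remove(cg->temps[i]);
        free(cg->temps[i]);
    }
    free(cg);
}
```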

* Actually, Rafael added lto_codegen_compile_to_file in r128108 and we've
had numerous LLVM releases since then. Now we have to support writing to
files forever; even if we support writing to memory, we can't remove the
writing-to-file path. Fortunately we can continue to implement this API in
the future by codegen'ing to memory plus a small amount of code to write to
disk. Still, I'm a little bit sad inside.
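
For what it's worth, the layering described in this footnote is small. Here
is a sketch in C, with a hypothetical in-memory entry point and a file-based
one built on top of it. demo_compile and demo_compile_to_file are made-up
names with made-up signatures (they do not mirror the real libLTO
prototypes), and the "object code" is a fixed placeholder.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory code generator: returns a buffer owned by the
   library and stores its length in *len. The bytes are a fixed
   stand-in for real object code. */
const void *demo_compile(size_t *len) {
    static const char object_bytes[] = "\x7f" "ELF-demo";
    assert(len != NULL);
    *len = sizeof object_bytes - 1; /* drop the trailing NUL */
    return object_bytes;
}

/* File-based entry point kept for compatibility: it is only the
   in-memory path plus a write. Returns 0 on success, -1 on failure. */
int demo_compile_to_file(const char *path) {
    size_t len = 0;
    const void *buf = demo_compile(&len);
    if (buf == NULL) return -1;
    FILE *f = fopen(path, "wb");
    if (f == NULL) return -1;
    size_t written = fwrite(buf, 1, len, f);
    if (fclose(f) != 0 || written != len) return -1;
    return 0;
}
```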

> I explicitly called that out. If the only purpose of these was to
> implement debugging features, then I'm sorry for the miscommunication!
>
>  If the intention is to let libLTO run on machines that don't have /tmp
> (this is what I thought), we should give libLTO an API that lets the linker
> decide where the files go.
>
> I choose $PWD/unique-tmp-workdir instead of /tmp/xxx.
> I should try /tmp/xxx if $PWD/unique-tmp-workdir does not work.
>
>   Maybe it wants to do smart things like putting it in a directory with
> the right permissions, or which is scheduled for cleanup in the event of a
> crash.
>
> That is on the TODO list. I tried to install the signal handler, but the
> supporting routines ignore directories (they only delete regular files on
> signal).
>

Hold on, what if the linker installs its own signal handler? If
lto_codegen_dispose/lto_module_dispose aren't safe to call during a signal
handler, what do you think about providing a signal-safe-emergency-shutdown
API to libLTO? It should only be as hard to implement as factoring out the
code you were going to write anyways.
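
Roughly what I have in mind, as a sketch: the library records its temp files
in fixed-size storage during normal operation, and the emergency entry point
does nothing but unlink() them, which POSIX lists as async-signal-safe. All
names (demo_create_temp, demo_emergency_shutdown,
demo_linker_signal_handler) are hypothetical, and this ignores directories
and multi-threading entirely.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define DEMO_MAX_PATHS 8
#define DEMO_PATH_LEN 256

/* Fixed-size storage filled in ahead of time, so the shutdown path
   never allocates memory or takes locks. */
static char demo_paths[DEMO_MAX_PATHS][DEMO_PATH_LEN];
static volatile sig_atomic_t demo_npaths;

/* Normal-context call: create an empty temp file and record it for
   emergency removal. Returns 0 on success, -1 on failure. */
int demo_create_temp(const char *path) {
    assert(path != NULL);
    int n = demo_npaths;
    if (n >= DEMO_MAX_PATHS || strlen(path) >= DEMO_PATH_LEN) return -1;
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0) return -1;
    close(fd);
    strcpy(demo_paths[n], path);
    demo_npaths = n + 1; /* publish only after the path is complete */
    return 0;
}

/* Hypothetical emergency-shutdown entry point. It only calls unlink()
   and restores errno, so a linker may call it from a signal handler. */
void demo_emergency_shutdown(void) {
    int saved_errno = errno;
    int n = demo_npaths;
    for (int i = 0; i < n; ++i)
        unlink(demo_paths[i]);
    errno = saved_errno;
}

/* What a linker-side handler might look like (also hypothetical). */
void demo_linker_signal_handler(int sig) {
    (void)sig;
    demo_emergency_shutdown();
}
```

The linker stays in charge of installing handlers; libLTO only promises that
this one function is safe to call from them.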

Nick

> If the intention is to implement something like -r, or to implement some
> new feature like -shared but which emits a shared-bitcode file instead of a
> shared-object,
>
> I'm not sure if bitcode can be encapsulated in *.so (*.a sure works), or
> whether it's good practice to do so.
> At least, I'm not happy to see a *.so that is not binary; the app will
> surely complain.
>
>
>
>    then I don't think that controlling the working directory then pulling
> out intermediate files is the right design anyways.
>
>
>
>
>  Or are you doing something else?
>
>
> I guess I'm doing something else.  Suppose you are building libmy.so from
> a.o b.o c.o d.o, where c.o and d.o are bitcode.
> LTO manages to compile c.o+d.o into t.o. Or, if c.o + d.o is huge,
> LTO splits them into several partitions,
> then compiles the partitions into t1.o t2.o ... tn.o.
>
> The intermediate files I'm talking about are those t*.o. libLTO hands these
> intermediate files to the linker and does *NOT* know when
> they will die. It is up to the linker to get rid of them.
>
> After the linker gets these intermediate files, it moves on to link the
> final libmy.so. Before exit, it removes all these intermediate files.
>
> As you can see, we certainly can live without a workdir. It is just a neat
> way to organize them, and it also obviates the need to
> create a unique name for each intermediate file to avoid name conflicts.
> This is especially handy if we want to compare
> performance etc.
>
>
>
>