[LLVMdev] LTO, Code Generation Options, etc
Eric Christopher
echristo at gmail.com
Wed Apr 1 23:16:21 PDT 2015
On Tue, Mar 31, 2015 at 8:59 PM Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:
>
> > On 2015 Mar 30, at 10:11, Eric Christopher <echristo at gmail.com> wrote:
> >
> >
> >
> > On Mon, Mar 30, 2015 at 9:52 AM Eric Christopher <echristo at gmail.com>
> wrote:
> > From PR18808 I said a few things and that I was going to redirect to the
> mailing list for further discussion. So here we are, go.
> >
> > 1) Whether or not to allow changing of target-cpu/target-feature/triple
> at link time code generation.
> >
> > - Not convinced here of the facility to do so. Could just recompile the
> individual bitcode files to get what you want, but there are some users
> that are trying to ship bitcode (as crazy as that sounds).
>
> IMO, it's cleanest if the target-cpu/target-feature/etc. are set at
> compile time. That's where users are accustomed to specifying codegen
> options already, and besides: the frontend needs to know the backend in
> order to conform to the ABI, set macros, emit calls to target-specific
> intrinsics, etc.
>
> I'll send a review of r233227 in a moment to that effect ;).
>
> > 2) How to pass other sorts of options to the backend for code generation
> >
> > - -ffoo options -fno-foo options. I.e. -fno-inline, etc. I think this is
> really pretty important from the user POV. It affects things at a more
> global level.
>
> This is easy to solve for -fno-inline in particular: we should just
> add a function attribute (`noinline`?) that the inliner should treat
> as a synonym for `optnone`. Any functions that come from translation
> units compiled with `-fno-inline` get ignored by the inliner; functions
> from other translation units participate fully.
>
>
This is pretty terrible. As you allude to here, it's a hack for
-fno-inline, and it's also no good for "I'd like to inline at individual
translation unit compile time, but not at LTO time."
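
To make the per-function proposal quoted above concrete, here is a toy
Python model of it. This is not LLVM code: the `Function` class, the
`no-inline-tu` attribute name, `compile_tu`, and `inliner_considers` are all
hypothetical names made up for illustration. It assumes "ignored by the
inliner" means a tagged function is skipped both as caller and as callee.

```python
from dataclasses import dataclass, field

# Hypothetical attribute name, invented for this sketch.
TU_NO_INLINE = "no-inline-tu"


@dataclass
class Function:
    name: str
    attrs: set = field(default_factory=set)


def compile_tu(names, fno_inline=False):
    """Model a frontend: tag every function of a TU built with -fno-inline."""
    attrs = {TU_NO_INLINE} if fno_inline else set()
    return [Function(name, set(attrs)) for name in names]


def inliner_considers(caller, callee):
    """Toy LTO inliner policy: skip any call site that touches a tagged function."""
    return (TU_NO_INLINE not in caller.attrs
            and TU_NO_INLINE not in callee.attrs)


tu_a = compile_tu(["a1", "a2"], fno_inline=True)   # built with -fno-inline
tu_b = compile_tu(["b1", "b2"])                    # built normally

print(inliner_considers(tu_b[0], tu_b[1]))  # both functions untagged
print(inliner_considers(tu_b[0], tu_a[0]))  # callee comes from the -fno-inline TU
```

Note that the tag is baked in when the translation unit is compiled. Nothing
in this model can express "inline per translation unit, but not at LTO
time," which is the limitation described above.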
> But in terms of setting up the LTO pass pipeline, some level of user
> customization makes sense. I'm not really sure how much is useful. We
> have a start at that with Peter's recent commits to add -O0/-O1/-O2
> (not that anyone thought too carefully about what's happening at those
> optimization levels).
>
Yeah, I commented pretty heavily on that thread if you'll remember. It's a
hackish workaround for the moment, but will work in the short term.
>
> > 3) The llvm developer debugging story
> >
> > - It's useful for llvm developers to be able to more accurately debug a
> set of IR using bisection or being able to turn off code generation
> options. Should this be done at the command level (i.e. infrastructure that
> clang and llc etc could even share), or should it be done at an llvm IR
> rewriting level? Don't know. I kind of want a rewriter, but I'm not wedded
> to any particular answer.
>
> I think some sort of rewriter makes sense.
>
> Long-term I'd still like to encode whether an option is overridable in
> a sane way (via default attribute sets or something), but I haven't
> had time yet to go back to my original proposal and refine it :(.
>
>
If I had any bright ideas I'd have said something. :)
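
For what the rewriter idea could look like at its very simplest, here is a
rough Python sketch that overrides a string function attribute such as
"target-cpu" in textual IR. The helper name `override_string_attr` and the
sample IR are assumptions for illustration. A real tool would presumably
operate on the in-memory module rather than on text, and this sketch only
rewrites attributes that are already present.

```python
import re


def override_string_attr(ir_text, key, value):
    """Replace every `"key"="..."` string attribute found in textual IR.

    Hypothetical helper: purely textual, so it does not add the attribute
    to functions or attribute groups that lack it.
    """
    pattern = re.compile(r'"%s"="[^"]*"' % re.escape(key))
    replacement = '"%s"="%s"' % (key, value)
    # Use a function as the replacement so `value` is inserted literally.
    return pattern.sub(lambda match: replacement, ir_text)


sample_ir = '''define i32 @f() #0 {
  ret i32 0
}
attributes #0 = { nounwind "target-cpu"="x86-64" "target-features"="+sse,+sse2" }
'''

print(override_string_attr(sample_ir, "target-cpu", "haswell"))
```

Something like this is enough for bisecting on a single codegen option, but
it says nothing about which options are safe to override.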
> >
> > That said I was actually envisioning something like:
> >
> > clang -emit-llvm foo.c -o foo.bc
> > ...
> >
> > clang -O3 -flto all.bc -arch x86_64h -o haswell_slice
> > clang -O3 -flto all.bc -arch x86_64 -o x86_64_slice
> >
> > for the same set of bitcode files. But given the front end language
> restrictions on doing anything actually interesting there it's not too much
> of a constraint.
>
> Many of the differences between architectures/CPUs affect preprocessor
> definitions, right? Link-time is too late for the frontend to emit
> Haswell-specific intrinsics, for example.
>
>
*nod* But useful for making code generation decisions (vectorization etc).
> That said, it would be cool if this worked.
>
>
Yep, which leads us to:
> > Another usage is the (admittedly one I don't think we want to support)
> halide one that I discovered this week:
> >
> > clang foo.c -emit-llvm -o foo.bc
> > clang -target aarch64-linux-gnu foo.bc -O3 -o foo.aarch64
> > clang -target x86_64-linux-gnu foo.bc -O3 -o foo.x86_64
> > ...
>
> Whereas this is just insane :0.
>
Sorta...
>
> >
> > I've since convinced them to use the pnacl sort of thing for more target
> independent code generation at the moment. It's a use case that could be
> thought about more though - especially as pnacl does the exact same sort of
> thing, just with a different triple for actual link time code generation,
> it looks more like:
> >
> > clang -target le64-unknown-unknown -emit-llvm foo.c -o foo.bc
> > clang -target aarch64-linux-gnu foo.bc -O3 -o foo.aarch64
> > clang -target x86_64-linux-gnu foo.bc -O3 -o foo.x86_64
>
This is probably a bit more sane, i.e. a generic situation. PNaCl has been
using this exact use case for quite a while now and, IIUC, it's also the
basis of the new Khronos proposal. It'd be nice to support this sort of
thing in some fashion (i.e. make restrictions), but I think at this point
telling them their behavior isn't allowed would be a little mean :)
-eric