[cfe-dev] First shot at Bug 4127 - The clang driver should support cross compilation

Ruben Van Boxem vanboxem.ruben at gmail.com
Tue Jan 10 03:32:15 PST 2012


On 10 Jan 2012 11:50, "James Molloy" <James.Molloy at arm.com> wrote:
>
> > It would be really nice to have smaller patches rather than larger
patches, and earlier discussion of them.
> >
> > Again, I remain very concerned about doing lots of work around
configuration files to configure a *broken* driver design. I think we'll
just end up with broken config file designs as well, and we'll
simultaneously make it that much harder to refactor and change the driver
in the future.
> >
> > I am still pushing to see refactoring and design work on the *existing*
use cases the driver supports before extending the use cases. I don't know
how to support cross compilation for more and more diverse platforms prior
to getting cross compilation for very basic platforms, or even non-cross
compilation into a better state.
>
> > Consider that system header search logic for Darwin, MinGW, and Cygwin
is still largely implemented in the Frontend rather than the Driver. This
is something I'm actively working on for reference...
>
> Indeed, and this is why it has taken so long since the conference to get
something solid (apart from other work). Not only do we need to work out an
end goal but make it achievable in small reviewable steps.
>
> I've studied the internals of the driver and thought of several different
ways of factoring it, including a composable "pass" framework where
arguments get successively modified by composable passes. That had some
promise, as the baked in behaviour could be completely controlled at
runtime easily, but was a no go because, and this is the major part:
>
> I can't find a way of validating that a gigantic refactor makes no
functional change in the driver. The regression tests aren't sufficient,
and I'm likely to break Darwin or some other target with a huge refactor -
Tools.cpp for example contains 5200 lines, some of which are common and
others not.
>
> Add to that the fact that after many iterations I still come back to the
current driver design as "not broken". There's nothing wrong with the
concept of Tools and ToolChains - in fact as an abstraction they suit
reality well.
>
> The main thing I see being the problem is the use of subclassing to
parameterise the Tool classes. Because they weren't designed for
parameterisation to start with, people have also copypasta'd huge chunks of
code around. There are at least 5 different functions that can drive "ld"
or "as", for example, each subtly different because one or two have had
bugfixes, some have trashed behaviour they don't support, etc etc.
>
> So here's my general "vision":
>
>  * A subclass of Tool will relate solely to the command it is
driving/producing, not OS/Arch specific configuration thereof. For example,
"binutils::As", "binutils::Ld", "gcc::Compile", "gcc::Link",
"gcc::Assemble", "visualstudio::Link".
>   * These tools will have a parameter "std::vector<std::string>
ExtraArgs", which is a list of extra arguments to give to the tool. This
will be created elsewhere.
>   * I have yet to work out where Darwin will fit here - ideally I'd like
to have Darwin do all its funky logic and stick it all in ExtraArgs then go
independent from there, but I don't know the best solution.
>
>  * A target should be able to select any tool for any JobAction. This
makes hard-baked ToolChains superfluous. You shouldn't have to subclass
ToolChain for your target, because it will be dynamically generated by...
>
>  * The "target database". I think this should be able to parameterise the
Tools in any way required - all OS-specific stuff (with the exception of
Darwin - that probably requires too much imperative code) should be in the
DB.
>  * This can take two forms - hard-baked and JSON. The hard-baked version
I see being a tablegen file, as similar as possible to the JSON
representation, which is compiled into Clang for speed.
>  * This way, we keep the speed and extensibility and channel them both
through the same interface, so that anything you can do hard-coded you can
also change at runtime.
>
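
The vision above can be sketched in C++ under a much simplified model.
ActionKind, ToolConfig, TargetEntry, selectTool and mergeEntries are
hypothetical names made up for illustration, not existing clang Driver
classes; only the ExtraArgs vector comes from the proposal itself, and the
flags used with it are placeholders.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not the real clang Driver API.
enum class ActionKind { Assemble, Link };

// One record of the sketched "target database": which program drives an
// action and which extra arguments that program receives for this target.
struct ToolConfig {
  std::string Program;                // e.g. "as" or "ld"
  std::vector<std::string> ExtraArgs; // target-specific arguments
};

// A target is just a table from action to tool configuration.
using TargetEntry = std::map<ActionKind, ToolConfig>;

// A Tool models only the command it drives; target specifics arrive as data
// rather than through subclassing.
class Tool {
public:
  explicit Tool(ToolConfig C) : Config(std::move(C)) {}

  // Build the command line: program, then ExtraArgs, then per-job arguments.
  std::vector<std::string>
  buildCommand(const std::vector<std::string> &JobArgs) const {
    std::vector<std::string> Cmd{Config.Program};
    Cmd.insert(Cmd.end(), Config.ExtraArgs.begin(), Config.ExtraArgs.end());
    Cmd.insert(Cmd.end(), JobArgs.begin(), JobArgs.end());
    return Cmd;
  }

private:
  ToolConfig Config;
};

// Pick the tool for an action from the target's table, so no per-target
// ToolChain subclass is needed.
Tool selectTool(const TargetEntry &Target, ActionKind Kind) {
  auto It = Target.find(Kind);
  assert(It != Target.end() && "target entry has no tool for this action");
  return Tool(It->second);
}

// Runtime records (e.g. loaded from JSON) replace compiled-in records action
// by action, so both sources go through the same interface.
TargetEntry mergeEntries(TargetEntry BuiltIn, const TargetEntry &Runtime) {
  for (const auto &KV : Runtime)
    BuiltIn[KV.first] = KV.second;
  return BuiltIn;
}
```

In this sketch a runtime record for the Link action would replace the
compiled-in one wholesale, while the Assemble record stays untouched; whether
overrides should replace or append is one of the schema questions the real
design would have to settle.
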
>
> So here's my migration plan:
>
>  1. The target database is where all the current imperative configuration
should be factored out to. Create an initial draft schema, a
ToolChain/HostInfo that uses it. At this point I suggest only using JSON as
this will be easier to change should the schema change than a tablegen
backend. The tablegen backend can be added later and the JSON data ported
over for speed.
>  2. Create the first of the "properly independent" Tools - binutils::Ld
and binutils::As, and use the target database to parameterise them.

This will lead to clang using ld to link directly, removing the dependency
on gcc to link, which is a good thing. It might require some form of
version detection, since newer ld versions add features that older ones lack.
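
One possible shape for that versioning, as a rough sketch only: each flag in
the database carries the minimum linker version that understands it, and the
driver filters on the detected version. Version, VersionedFlag, parseVersion
and supportedFlags are hypothetical names, and the flags and version numbers
in any use of it would have to come from the real ld release history, which
this sketch does not attempt to encode.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical helper types; not part of clang or binutils.
struct Version {
  int Major;
  int Minor;
};

// Parse a "major.minor" string; anything after the minor number is ignored.
Version parseVersion(const std::string &Text) {
  std::size_t Dot = Text.find('.');
  assert(Dot != std::string::npos && "expected a version like major.minor");
  return Version{std::stoi(Text.substr(0, Dot)),
                 std::stoi(Text.substr(Dot + 1))};
}

// Lexicographic comparison on (Major, Minor).
bool atLeast(const Version &Have, const Version &Need) {
  return std::tie(Have.Major, Have.Minor) >= std::tie(Need.Major, Need.Minor);
}

// A flag together with the first linker version assumed to accept it.
struct VersionedFlag {
  std::string Flag;
  Version Since;
};

// Keep only the flags the detected linker version is assumed to support.
std::vector<std::string>
supportedFlags(const Version &Linker, const std::vector<VersionedFlag> &Wanted) {
  std::vector<std::string> Out;
  for (const VersionedFlag &F : Wanted)
    if (atLeast(Linker, F.Since))
      Out.push_back(F.Flag);
  return Out;
}
```
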

>    * Probably first patch checkin point? Use a new driver debug flag to
enable the new behaviour -ccc-dynamic-driver.
>  3. Port more ToolChains to the target database. For Linux, we'd need to
keep the distro detection logic outside the targetdb, but then we shouldn't
need clever header detection methods as we can bake the expected header
locations for a given distro into the target database.

This would keep every Linux distro and version in the clang codebase.
Wouldn't it be preferable to move at least the header search dirs out of
the code, into a configure flag or a conf file? True Linux cross-compilation
may not want the native system headers at all, but ones specific to a target.
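
A small sketch of what I mean, with everything in it hypothetical
(resolveHeaderDirs, the distro key, the sysroot handling and the paths are
made up for illustration): directories given through a configure flag or
conf file win, the baked-in per-distro list is only a fallback, and a
sysroot prefix keeps native headers out of a cross build.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using DirList = std::vector<std::string>;

// Hypothetical resolution order:
//   1. directories supplied by configuration, if any;
//   2. otherwise the compiled-in list for the detected distro, if known.
// Every resulting directory is then prefixed with the sysroot (which may be
// empty for a native build).
DirList resolveHeaderDirs(const std::map<std::string, DirList> &BuiltIn,
                          const std::string &Distro,
                          const DirList &ConfiguredDirs,
                          const std::string &Sysroot) {
  DirList Dirs = ConfiguredDirs;
  if (Dirs.empty()) {
    auto It = BuiltIn.find(Distro);
    if (It != BuiltIn.end())
      Dirs = It->second;
  }
  for (std::string &Dir : Dirs) {
    // The sketch assumes absolute paths so that prefixing is unambiguous.
    assert(!Dir.empty() && Dir[0] == '/' && "expected an absolute path");
    Dir = Sysroot + Dir;
  }
  return Dirs;
}
```

An unknown distro with no configured directories yields an empty list here;
a real driver would have to decide whether that is an error or a reason to
fall back to probing.
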

>  4. Sort out what we're doing with Darwin. Is it having its own set of
Tools and living in its own domain, or is it linked to the independent
tools?
>  5. The Big Switchover, at which point we can remove ideally around 4000
lines in Tools.cpp and 90% of ToolChains.cpp (probably also HostInfo.cpp)
and end up with a driver which is centrally configurable both at compile
and runtime.

Don't forget InitHeaderSearch; that's where most of the cruft is located.

These are just my two cents...

Ruben

>
>
> OK, so there's a full braindump. I was going to throw this up for
discussion in true LLVM style - "with a patch" - in a week or so but my
hand has been forced ;)
>
> Note that this doesn't address the parsing logic disparity between Driver
and Frontend - that's not my aim. I'm hoping to "fix the driver for
cross-compilation", not fix the entire driver. I'm hoping someone else
might chip in there!
>
> Let the heckling commence! ;)
>
> Cheers,
>
> James
>
>
>
> From: Chandler Carruth [mailto:chandlerc at google.com]
> Sent: 10 January 2012 09:24
> To: James Molloy
> Cc: Sebastian Pop; cfe-dev at cs.uiuc.edu Developers;
clang-commits at cs.uiuc.edu
> Subject: Re: [cfe-dev] First shot at Bug 4127 - The clang driver should
support cross compilation
>
> On Tue, Jan 10, 2012 at 12:25 AM, James Molloy <James.Molloy at arm.com>
wrote:
> As I say, I'm working on a patch that I think is a superset of yours and
would conflict massively. I've been planning it for some time and think I
have a viable end goal and route to get there.
>
> It would be really nice to have smaller patches rather than larger
patches, and earlier discussion of them.
>
> Again, I remain very concerned about doing lots of work around
configuration files to configure a *broken* driver design. I think we'll
just end up with broken config file designs as well, and we'll
simultaneously make it that much harder to refactor and change the driver
in the future.
>
> I am still pushing to see refactoring and design work on the *existing*
use cases the driver supports before extending the use cases. I don't know
how to support cross compilation for more and more diverse platforms prior
to getting cross compilation for very basic platforms, or even non-cross
compilation into a better state.
>
> Consider that system header search logic for Darwin, MinGW, and Cygwin is
still largely implemented in the Frontend rather than the Driver. This is
something I'm actively working on for reference...
>