[cfe-commits] [PATCH] Driver reengineering - first patch

Mon Jan 23 00:30:45 PST 2012

Hi Daniel, Chandler,

Thanks for looking at this. In general I completely agree with your
assessment that the patch is way too large. This is partly an artefact of
the way we work and code review internally, partly because I wanted to show
the end goal of the patch series as well as just the steps to get there, and
partly because I didn't think hard enough about how best to split it down
further.

> They way I see it, there are at least a couple major components of
> this work, which can be worked on independently and incrementally.
> .. snip ..
> Once those two parts are workable, then the patches to actually switch
> over to using the DB should be relatively small and easy to understand
> and review.
> 
> What do you think?

I agree, and feel rather stupid for not spotting this course of patches
earlier.

Chandler Carruth wrote:
> Have you coordinated with Michael Spencer about this? He is already
working on many of the same issues,
> and has a more principled approach based on the YAML spec. I would prefer
if we could follow that rather 
> than adding arbitrary extensions to the JSON parsing.

I have not. I assume you're talking about the YAML parser in lld? That might
be worthwhile, I'll look into it.

> I'm starting to work through this patch (and some of your others)

I've only got one other outstanding patch as far as I know?

Cheers and apologies,

James

-----Original Message-----
From: daniel.dunbar at gmail.com [mailto:daniel.dunbar at gmail.com] On Behalf Of
Daniel Dunbar
Sent: 21 January 2012 18:13
To: James Molloy
Cc: cfe-commits at cs.uiuc.edu
Subject: Re: [cfe-commits] [PATCH] Driver reengineering - first patch

Hey James,

Here are my initial comments.

I basically agree with Chandler as to the size and scope of the patch.
I find it very hard to review large monolithic patches. I would love
to see this functionality built up more incrementally.

In general I agree with the rough direction, but have a lot of nits on
the details.

They way I see it, there are at least a couple major components of
this work, which can be worked on independently and incrementally.

(1) Refactoring the existing driver to use data driven configuration
instead of straight C++ code.

This is something which can happen very incrementally. Someone just
needs to work at refactoring the existing driver implementations
(mostly in Tools.cpp) to generalize them and factor the target
dependent bits out into data classes. Those data classes would
eventually come from the JSON database, but there is nothing
preventing the code from being structured that way now.

Pursuing this work independently is good because:
(a) it increases the readability of the driver
(b) it makes it more obvious what the schema for the data model needs to be

(2) We need a DB implementation which handles selecting models based
on command line flags. This is an interesting part of the design, and
has a lot of design choices:
(a) where do DB files live on disk?
(b) do we allow multiple layers of implementations?
(c) what is the selection criteria? how do -arch, -m32, -m64 interact
with -triple (my proposal is to frame them as lookup parameters).

>From my perspective this can be treated as an independent data
structure for a long time with its own set of tests, long before we
get to the point of wiring it up with (1) to make the driver actually
use the DB.

>From a very high level, my basic request for an implementation
strategy would be: refactor Clang to move towards using data driven
models but without changing behavior yet. Those patches should be the
kind of thing we can take into TOT without worry, they will just be
making the code cleaner.

Similarly, by working on the DB as an independent data structure, it
is also fine to take those patches into TOT because they won't be
increasing complexity or dependencies in other places.

Once those two parts are workable, then the patches to actually switch
over to using the DB should be relatively small and easy to understand
and review.

What do you think?

 - Daniel

On Wed, Jan 18, 2012 at 3:26 AM, James Molloy <james.molloy at arm.com> wrote:
> Hi,
>
> The attached patch is the first of the clang driver revamp. This patch has
> no intended functionality change unless you use the new hidden flag
> "-ccc-dynamic-driver".
>
> This patch addresses two points brought up during the driver BoF last
> November, namely
>  * "There should be a 'driver database' of known platforms"
>    * "This should fall back for the unusual case: load from some sort of
> config file (in JSON format)"
>  * "We should be able to ship a compiler that can always act like it's
> targetting Ubuntu".
>
> Firstly, this first patch has some limitations:
>  * It has only been fully implemented for Linux, and only tested (to
> exhaustion) on Ubuntu Natty and RHEL5.
>  * It only deals with what should be the "uncommon" case - loading JSON at
> runtime.
>  * It doesn't yet copy the target DB into the correct place in the build
or
> install dirs.
>
> Secondly, this is what it does do:
>  * llvm.diff adds SMLoc tracking, raw MemoryBuffer input and '// ...'
style
> comment support to the JSON parser. This makes developing the target DB
> files a lot easier, and the comment support is invaluable (and means we
can
> add the copyright header).
>  * Adds an implementation of the Target Database for linux, and minix (as
> an example of a trivial platform). The database parser is completely
> implemented, and a schema is implemented and described that should be
> sufficient for everything we need, and also simple enough to grasp easily.
> This has been at the expense of some terseness in the DB definition.
>
> The input to the target database is a JSON file. This JSON file can
include
> other files. The file consists of a list of lists. Each of those lists is
a
> "group" of objects. When queried, the targetdb will iterate over these
> groups. For each group, all objects are iterated in order. If an object
> "matches" the query criteria, its contents are added to the query result,
> and no more objects are iterated over in that group.
>
> See lib/Driver/TargetDB_Linux.json for an example of how it works. A query
> to the target database will result in a structure containing everything
you
> should need to compile/link/assemble on the target. The format of this
> structure is derived from a tablegen file (the patch includes a new
tablegen
> backend), so there is no boilerplate if you want to add a new struct
member.
>
> The patch has been tested by taking an indicative subset of all possible
> Clang flags (especially interesting ones such as -static, -shared, -pie,
-S
> etc) and brute force running through the entire set of combinations of
these
> flags testing the targetdb-based driver's generated command lines against
> the current one.
>
> The new patch adds a new command flag "-ccc-target-triple" to be
> distinguished from -ccc-host-triple. (When this patch and its followups
are
> accepted and turned on by default, this will change into "-target"). This
> way we can actually (finally!) test "compile targetting X as if you were
on
> Y", which should let us increase our driver test coverage (I also added
> -ccc-linux-distro which allows you to pretend to be a specific distro, as
> this is not encoded in the triple).
>
> Speaking of tests, the submitted patch does not have any yet. I intend to
> submit them in a later patch, as I'd like this to be at least concept
> approved before I spend an infinite amount of time writing a decent test
> set.
>
> Heckling... commence! :-)
>
>
> P.S.: Would people prefer me to commit this now and have post-commit
reviews
> going on? It's quite large so incrementally changing it by review will be
a
> PITA for reviewers to look through each time (and the functionality is
> guarded by flag...)
>
> Cheers,
>
> James
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>