[cfe-dev] final distributed clang patch
kremenek at apple.com
Tue Jul 8 16:27:10 PDT 2008
I agree with the comments that Eli and Chris made; the code
duplication is something we want to avoid. Eli brought up an
excellent point that key pieces of the driver should be factored off
to a separate library, and I too have felt this way for some time. I
think that even resolving the various preprocessor and compiler
options (e.g., -I, -D, etc.) that are needed to instantiate a
Preprocessor should also be factored out of clang.cpp into a separate
library.
I also agree with Chris's comments that separating the "distcc" driver
from a regular clang driver is a good idea. That keeps the distcc
implementation simpler, and potentially allows it to be used with
multiple compilers (not just clang). I myself was fine with
integrating the distcc support directly into the clang driver for a
first pass, but because the distcc driver will not use all of the same
functionality as the regular clang driver (and obviously do a few
things that the regular clang driver does not), the better long term
approach is to factor key components of the clang driver into
libraries, make clang and distcc-clang separate executables, and
simplify the logic for both.
One thing that hasn't emerged in this discussion is whether the
clang distcc should interoperate with the traditional distcc
implementation, and (a different but related issue) whether we
should require that the compiler itself be clang. One advantage of a
clang-based distcc, independent of using clang to perform compilation,
is that clang-distcc can do the source preprocessing itself without
forking off a separate process (which is what the traditional distcc
implementation does). This seems like a good first step: build a
distcc client that just takes care of preprocessing in-process, and
see what kind of speedups you get over forking and preprocessing.
Ultimately we're interested in speed and scalability, and small steps
like these help guide the design.
Interoperability with other compilers doesn't mean we should limit the
design of clang-distcc. We can certainly implement special
functionality when multiple compiler "workers" are based on clang
(e.g., serializing ASTs, special caching, etc.).
I like the concept of the NetSession class, although the issue of
interoperability with existing distcc implementations is something
that is worth discussing. Chris is right that the system-specific
APIs, such as the use of sockets, should not be in header files. A
PIMPL approach, like what we use for FileManager, would probably work
well (where the system-specific stuff only appears in the .cpp file).
As for the clang server, both pthreads and sockets are system-specific
APIs. We'll want a design that keeps the threading model separate
from the code that processes a unit of work. This will allow us to
tailor the implementation to use the best parallel computing
primitives that are available on a specific architecture.
I'm also a little confused by the overall design. It looks like a
client (a 'clang' process) connects to a server, sends the
preprocessed source to the server, waits for the server to chew on the
file, gets the processed output from the server, and then writes the
output to disk. It appears that the client attempts to connect to
different servers in a serial fashion, and then picks the first
available server. Is this how traditional distcc works? (I actually
don't know.) It's a simple design, but it doesn't lend itself well to
good load balancing or to reducing the latency of firing off
compilation jobs (a bunch of connection attempts in serial fashion
seems potentially disastrous for performance). This particular point
isn't a criticism of your patch; what's there is fine to get things
started. I'm not a distributed computing expert, but something akin
to the Google MapReduce system (which has workers and controllers)
seems more flexible for fault tolerance, load balancing, and so
forth. This is certainly something worth discussing in a higher-level
discussion of the overall design of the system.
A few comments inline.
On Jul 7, 2008, at 9:13 AM, Peter Neumark wrote:
> Here is the final patch for clang to support network distributed
> compilation. (clang.patch file)
> There is also the server part attached. (the tar.gz file)
Like the client, the server shouldn't have so much code copied from
the Driver, and it certainly doesn't need to use all of the
ASTConsumers in the regular Clang driver. General work (by anyone who
is interested) on modularizing the driver will help make this much
easier.
> There are 3 new files added to the Driver directory:
> PrintPreprocessedOutputBuffer.cpp, which is a modification of
> PrintPreprocessedOutput.cpp to support printing text to a std::ostream.
I'm not certain why a separate version of PrintPreprocessedOutput was
necessary. iostreams are slow, and writing to sockets using the FILE*
abstraction is perfectly acceptable (via fdopen()).
> Other new files: NetSession.h and NetSession.cpp which handles and
> contains all networking code (portable thin networking code).
> There are some files changed, mostly to support saving their output
> to a std::ostream. I've used that to pass clang ASTConsumer data to
> another computer via the network.
> There are 3 new options added to clang. The basic one is
> -distribute, which enables distributed compilation. The other two
> are: -dist-preprocesslocally and -dist-serializelocally.
> If the first one is enabled then clang sends a preprocessed file to
> clangserver (a process on another machine) to compile. In the
> second case the lexing and parsing is done locally and the built and
> serialized AST is sent to clangserver.
> You can play with this using -dist-preprocesslocally because it is
Overall, I think this is a good initial start! I think that the next logical
steps would be to look at both overall design as well as issues of
code structure (addressing the comments on modularity, isolating
various implementation details, etc.). Getting a few interesting
performance timings would also be extremely useful to help shape some
of those design decisions.
Incidentally, how well does the code work when the two processes
(client and server) are actually on two different machines? Right
now, the client always connects to "localhost". Getting performance
timings when client and server are on the same machine versus on
different machines would also show how much things like network
latency are a factor in the design. There may also be some
correctness issues that are masked by having the client and server on
the same machine.