[cfe-dev] my gsoc proposal: llvm/clang distcc

Peter Neumark peter.neumark at gmail.com
Mon Apr 7 07:34:08 PDT 2008


Hi,
here is my proposal:

Synopsis
    The main purpose of this project is to implement a clang/llvm based distcc.
    Distcc means distributed compiler. It can be used as a replacement for gcc.
    Clang distcc will support distributed (over the network) compilation to any
    architecture supported by llvm. The main advantages of the llvm/clang distcc
    over the original gcc based distcc are performance (less memory, less
    compile time) and customization.
    All these benefits come from llvm and clang.
    The new distcc will be a compiler driver (or frontend) built from the clang
    and llvm libraries.
    The driver will have two usage modes: gcc option mode and clang option mode.
    So it can be used as a drop-in replacement for gcc, and it will handle all
    distribution and caching tasks.
    There will also be an admin daemon, which will support configuration and
    distribute incoming requests to nodes. A distcc daemon will run on each
    node and handle incoming tasks. This daemon will do the compilation work.
    Clang distcc will support languages via clang, so currently C and
    Objective-C will be supported.

Functionality details
    Usage:
        + set up the network:
            - start the admin daemon on one node
            - start a distcc daemon on each node
            - register the nodes with the admin daemon

        + set up the local machine:
            - configure distcc (set the admin node address)
                This will generate a config file.

        + use, e.g.: make CC=distcc
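
    The node registration step above could be backed by something as small as
    the following sketch. It is only an illustration: the class name, the
    method names, the example addresses and the round-robin policy are all
    assumptions, not part of the proposal.

```python
class AdminRegistry:
    """Hypothetical in-memory node list kept by the admin daemon."""

    def __init__(self):
        self._nodes = []   # registered node addresses, in registration order
        self._next = 0     # counter used for round-robin selection

    def register(self, address):
        """Register a node; registering the same address twice is a no-op."""
        if address not in self._nodes:
            self._nodes.append(address)

    def pick_node(self):
        """Return the node that should receive the next request."""
        if not self._nodes:
            raise RuntimeError("no nodes registered")
        node = self._nodes[self._next % len(self._nodes)]
        self._next += 1
        return node


registry = AdminRegistry()
registry.register("10.0.0.1:4000")   # example addresses, made up
registry.register("10.0.0.2:4000")
first = registry.pick_node()
second = registry.pick_node()
```

    A real admin daemon would probably also track node load and availability;
    round-robin is just the simplest policy to show.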

Implementation details
    Development will be done incrementally, from the simplest solution to more
    complex ones.
    The simplest solution is to do all source parsing locally and distribute
    the built AST to a Node for optimization and code generation; the Node
    sends the result back when it is done.
    A more advanced solution is to use a file sharing protocol to share the
    local source files (for including); parsing is then done on the Node side,
    and included files are fetched via the file sharing protocol.
    An even more advanced solution is to cache built ASTs in a central database
    to avoid parsing and building them each time. This is especially useful for
    header files.
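
    The caching idea can be sketched as a "look up, otherwise build and store"
    step. This is a minimal illustration only: the class, the in-memory dict
    standing in for the central database, and the build callback are
    assumptions.

```python
class AstCache:
    """Hypothetical AST cache; a dict stands in for the central database."""

    def __init__(self):
        self._store = {}
        self.misses = 0   # how many times an AST had to be built

    def get_or_build(self, key, build):
        """Return the cached AST for key, calling build() only on a miss."""
        if key not in self._store:
            self.misses += 1
            self._store[key] = build()
        return self._store[key]


cache = AstCache()
# The lambda stands in for the (expensive) parse of a header file.
ast_first = cache.get_or_build("stdio.h-key", lambda: "serialized AST")
ast_again = cache.get_or_build("stdio.h-key", lambda: "serialized AST")
```

    The second call returns the stored value without invoking the build
    callback, which is where the saving for frequently included headers would
    come from.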
    So there will be these standalone programs:
        + distcc, supporting gcc options
        + distcc, supporting clang options
        + distcc daemon for Nodes (the network is composed of Nodes, which will
          do the compilation work)
        + distcc admin daemon (stores information about the distcc Node
          network)

    All necessary software components are available in the llvm/clang sources,
    except network handling.
    So a thin network layer will be implemented for unix and windows platforms.
    The new distcc driver will be placed in the clang/Driver directory.
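
    One common way to build such a thin network layer is length-prefixed
    message framing. The sketch below is an assumption about how it might look,
    not the proposed design: the function names and the 4-byte big-endian
    length header are made up for illustration.

```python
import io
import struct


def write_frame(stream, payload: bytes) -> None:
    """Write one message: 4-byte big-endian length, then the payload."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_frame(stream) -> bytes:
    """Read one message written by write_frame; raise EOFError if truncated."""
    header = stream.read(4)
    if len(header) != 4:
        raise EOFError("truncated frame header")
    (length,) = struct.unpack(">I", header)
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("truncated frame payload")
    return payload


# An in-memory buffer stands in for the connection between driver and Node.
buf = io.BytesIO()
write_frame(buf, b"serialized AST bytes")
buf.seek(0)
received = read_frame(buf)
```

    The same two functions should work on a buffered binary file object
    obtained from a socket with socket.makefile("rwb"), which keeps the
    platform-specific part of the layer small.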

    For caching, a cached AST can be identified by an MD5 sum computed over the
    source file, the MD5 sums of the included files, and the options used in
    parsing (defines).
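
    A cache key of this kind might be computed roughly as follows. This is a
    sketch under assumptions: MD5 from Python's hashlib is used as the digest,
    and the function name, the argument layout and the NUL separator between
    defines are invented for the example.

```python
import hashlib


def ast_cache_key(source: bytes, include_digests, defines) -> str:
    """Combine the source digest, include digests and defines into one key.

    include_digests: hex MD5 digests of the included files, in include order.
    defines: preprocessor definitions such as "DEBUG=1", in command-line order.
    """
    h = hashlib.md5()
    h.update(hashlib.md5(source).digest())
    for digest in include_digests:
        h.update(bytes.fromhex(digest))
    for define in defines:
        # The NUL byte keeps ["AB"] and ["A", "B"] from producing the same key.
        h.update(define.encode("utf-8") + b"\0")
    return h.hexdigest()


header_digest = hashlib.md5(b"int puts(const char *);\n").hexdigest()
key_debug = ast_cache_key(b"int main(void) { return 0; }\n",
                          [header_digest], ["DEBUG=1"])
key_release = ast_cache_key(b"int main(void) { return 0; }\n",
                            [header_digest], ["NDEBUG"])
```

    Changing the source, any included file, or any define yields a different
    key, so a stale AST is not reused.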

Development methodology
    The work will be done via svn. I'll need a clang branch for my work, but
    this is not required; a standalone svn repository will work too.
    I use Ubuntu Linux (Gutsy Gibbon). I'll send a weekly project report.
    I'll write user and developer documentation (html or pdf).

Project Schedule
    Before the gsoc midterm evaluation the simplest method will be implemented.
    The file sharing protocol and caching will be done in the second part of
    soc. But they can be worked out in depth during the first part, once the
    simplest solution is ready.

Bio
    I'm a 23-year-old student at the Budapest University of Technology and
    Economics. I started programming 7 years ago, and I've been using the C
    language for 6 years and the C++ language for 5 years. I've been using
    open source software for 7 years.
    Compilers are one of my passions. I like efficient and clean solutions.
    I like nice, clean and well-documented APIs, like Qt, Ogre3D, llvm, clang.
    I have solid knowledge of OOP and software engineering.
    I very much like reusable, clean, easy-to-understand code.
    I'm familiar with the following programming languages:
        - C (6 years)
        - C++ (5 years)
        - python (3 years)
        - java (4 years)
        - SML (1 year)
        - Prolog (1 year)
        - lua, squirrel
        - haskell (current passion)
    I've been tracking llvm and clang development since the last gsoc, because
    I noticed llvm in the gsoc projects list.
    I've had an svn copy of llvm and clang since October 2007, and I compile it
    regularly. I've read all the docs available from llvm and clang.
    I also know the source code structure and its functionality.