[cfe-dev] Question about clang distcc

琬菁楊 ching1119.cs96 at g2.nctu.edu.tw
Sun Jun 17 08:36:50 PDT 2012


Hello,

I noticed this project on the open projects web page and read the
previously posted thread. I am a master's student and want to do a project
related to LLVM/Clang as my master's thesis. My goal is not just to
complete my thesis but also to contribute to the LLVM community.

I am curious about the following:

I have tried distcc and found that it works well with clang. However,
distcc uses gcc as an external program, which is inefficient for Clang
because Clang is library-based. Is this the reason to develop a new distcc?

Ching

2012/6/17 Lyu Mitnick <mitnick.lyu at gmail.com>

> On Sat, Jun 16, 2012 at 9:14 PM, Manuel Klimek <klimek at google.com> wrote:
> >
> > On Sat, Jun 16, 2012 at 8:36 PM, Matthieu Monrocq <matthieu.monrocq at gmail.com> wrote:
> >>
> >>
> >>
> >> On Sat, Jun 16, 2012 at 3:19 PM, Lyu Mitnick <mitnick.lyu at gmail.com> wrote:
> >>>
> >>> Hello Douglas,
> >>>
> >>> I have read all the posts carefully. According to the discussion,
> >>> what we can do better than the original distcc is as follows:
> >>>
> >>> 1) The intermediate files passed over the network would be serialized ASTs
> >>> 2) The intermediate files passed over the network would be LLVM IR
> >>> 3) Centralized admin daemon
> >>> 4) Use PCH
> >>>
> >>> To address the points above, we could extend the original distcc.
> >>> However, Chris Lattner mentioned that the first milestone of the
> >>> clang distributed build project is implementing a new distcc.
> >>>
> >>> http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-April/001357.html
> >>>
> >>> I would like to know why, and what the desired design of the clang
> >>> distributed build project is for the cfe community.
> >>>
> >>> Mitnick
> >>>
> >>> >
> >>> > No, this project has not been accomplished. I don't think there was
> >>> > any real progress on it since that discussion.
> >>> >
> >>> >        - Doug
> >>>
> >>
> >> We have been using distcc (on top of gcc) for a while at work, and it
> >> does work, somewhat, but it has a big issue: preprocessing is a huge
> >> part of compilation, and not being able to get rid of it creates a
> >> bottleneck that inhibits true scalability. Given that we are using
> >> 24-core servers, we could push to about 40/60 parallel compilations
> >> (interesting when an error occurs in a header and the local machine has
> >> to compile those 40/60 files locally to display the errors); any
> >> further and we would not observe any significant improvement in
> >> compilation time, because the local machine became the bottleneck.
> >>
> >>
> >> We experimented for a time with a solution where we streamed the raw
> >> unprocessed files, with a filter in place to push only "local" files
> >> and keep replicas of the 3rd party headers on the distcc hosts. It
> >> worked quite well (not much gain, but slightly faster) as long as the
> >> distcc host had an up-to-date collection of 3rd party headers and the
> >> local directory hierarchy was similar. We had a few maintenance issues
> >> with it, so we fell back to the traditional distcc.
> >>
> >> Honestly, we got a much bigger performance boost from ccache than from
> >> distcc.
> >>
> >> I am unsure how to work around the local preprocessing issue, and I am
> >> afraid that no significant progress will be made as long as it stands
> >> in the way. I would be glad to hear if some folks have ideas to get
> >> around it. These days I put more hope in a "persistent" process that
> >> would cache various stages of the compilation pipeline (maybe using the
> >> daemon Chandler was talking about?).
> >
> >
> > clangd is not about speeding up distributed compilations. I'm not sure
> > I understand what problems you ran into with your distributed build
> > that pushed the "raw" files, but with enough caching you can save
> > considerable time and processing power that way [1]. Hopefully modules
> > will pave the way for an even better distributed C++ build. Well, and
> > better linkers...
> >
> > [1] http://google-engtools.blogspot.de/2011/09/build-in-cloud-distributing-build-steps.html
> >
> >
>
> My point was that maybe distributed compilation is not what we should
> be aiming for. Module-based languages such as Java don't have such
> issues because they don't spend their time
> lexing/preprocessing/parsing the same headers over and over; they are
> amenable to saving the evaluated AST of a module. C++ developers
> have often dreamt, even without modules, that perhaps a sufficiently
> smart compiler process could manage this for header files... the myth
> of the C++ compilation server.
>
> Anyway, I think that clangd could be part of the solution. When using
> Java with Eclipse, the files are compiled in the background, so you
> have to wait less. We could imagine the same possibility with clangd:
> an option so that when the file passes -fsyntax-only, the daemon
> generates the associated .o.
>
> It's not the same idea as distributed compilation, but distributed
> compilation is more about rebuilding from scratch, I think, and people
> kind of expect a build from scratch to take long. It's the incremental
> recompilation that is a pain (when you are working), and I believe
> clangd would be amenable to speeding this up... though it's maybe a
> little early.
>
> -- Matthieu
>