[cfe-dev] zapcc compiler

Sat May 23 12:25:51 PDT 2015

Hi Chris,

I am the principal developer of zapcc and can add some more technical
details. zapcc is a heavily modified clang (the diff is about 200K) with
additional code outside the llvm/clang codebase. zapcc operates in a
client/server mode. The compilation server (think of it as clang -cc1)
stays in memory and accepts compilation commands from the driver. The
client runs as usual up to cc1_main(), which then communicates with the
server rather than launching another clang process.

zapcc distinguishes between two classes of source files: the "system"
files, whose entire compilation state is kept in memory, and the "user"
files, whose compilation state is removed once they are compiled. The
programmer selects the "user" files with wildcards set in a configuration
file. By default the user files are .c, .cpp, .cxx and .CC files, but it
could just as easily be all files in /home/user/yaron or whatever. The
system files are expected to be unchanging (a change to one will not be
recognized until the server is restarted), while the user files are the
ones being modified. As an example, you could make
llvm/lib/MC/MachObjectWriter.cpp the "user" file, so the compilation
result of every other file would be kept in memory.
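
To make the user/system split concrete, here is a minimal sketch of
wildcard-based classification. It is not zapcc's code: globMatch,
isUserFile and the pattern list are hypothetical names, the real
configuration file format is not shown in this message, and this toy
matcher lets '*' match '/' as well.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: minimal glob matcher supporting '*' and '?'.
// In this sketch '*' also matches path separators.
bool globMatch(const std::string &pat, const std::string &text) {
  std::size_t p = 0, t = 0, mark = 0;
  std::size_t star = std::string::npos;
  while (t < text.size()) {
    if (p < pat.size() && pat[p] == '*') {
      star = p++;   // remember the star and try matching zero characters
      mark = t;
    } else if (p < pat.size() && (pat[p] == '?' || pat[p] == text[t])) {
      ++p;
      ++t;
    } else if (star != std::string::npos) {
      p = star + 1; // backtrack: let the last star absorb one more char
      t = ++mark;
    } else {
      return false;
    }
  }
  while (p < pat.size() && pat[p] == '*') ++p;
  return p == pat.size();
}

// A file is a "user" file if any pattern matches; otherwise it is treated
// as a "system" file whose state would stay cached.
bool isUserFile(const std::string &path,
                const std::vector<std::string> &userPatterns) {
  for (const std::string &pat : userPatterns)
    if (globMatch(pat, path)) return true;
  return false;
}
```

With the default extensions expressed as the patterns "*.c", "*.cpp",
"*.cxx" and "*.CC", MachObjectWriter.cpp would be classified as a user
file and any .h header as a system file.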

Not only is a header file parsed just once, but all its template
instantiations *and* generated code are kept in memory, ready for the next
compilation. zapcc is very careful to undo anything related to the "user"
files in the clang/LLVM data structures. This is very complex, which is
why zapcc is not yet ready for a public beta. We prefer to release a more
reliable product rather than waste your time.
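
The effect of keeping header state while discarding user state can be
pictured with a toy model. This is only a sketch under simplifying
assumptions: CompilationServer and its members are hypothetical names,
and a real server caches ASTs, instantiations and generated code rather
than a set of header names.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a persistent compilation server.
class CompilationServer {
public:
  // "Compile" one user file together with the headers it includes.
  void compile(const std::string &userFile,
               const std::vector<std::string> &headers) {
    for (const std::string &h : headers) {
      // insert() reports whether the header was seen for the first time;
      // only then does it count as parsed.
      if (headerCache_.insert(h).second) ++headerParses_;
    }
    (void)userFile;  // the user file itself is processed every time
    ++userCompiles_;
    // A real server would now undo all state derived from userFile,
    // leaving only the cached header state behind.
  }

  int headerParses() const { return headerParses_; }
  int userCompiles() const { return userCompiles_; }

private:
  std::set<std::string> headerCache_;  // survives across compilations
  int headerParses_ = 0;
  int userCompiles_ = 0;
};
```

Compiling the same user file twice against the same two headers gives
two header parses in total but two user compilations, which is where the
time saving comes from.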

There are limitations with this approach: previously declared entities
are still visible in subsequent compilations. We hope to address this
limitation someday, but not in the near future. With a good-quality modern
codebase such clashes are rare. In the LLVM/clang codebase there are just
a few clashes, which can easily be fixed by renaming one of the clashing
entities. Some of the renaming would be in line with the new code style
anyhow... When a clash does occur, zapcc automatically resets the
compilation cache and retries the compilation before giving up. It also
resets if the compilation flags change or if it finds that it cannot undo
a compilation.
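
The reset-and-retry behaviour described above amounts to something like
the following sketch. The names Result, compileOnce and resetCache are
hypothetical and stand in for whatever the server does internally.

```cpp
#include <functional>

enum class Result { Ok, Failed };

// Try a compilation against the warm cache; if it fails, reset the cache
// and retry exactly once before reporting the failure.
// compileOnce and resetCache are hypothetical callbacks.
Result compileWithRetry(const std::function<Result()> &compileOnce,
                        const std::function<void()> &resetCache) {
  if (compileOnce() == Result::Ok) return Result::Ok;
  resetCache();          // drop all cached state, e.g. after a name clash
  return compileOnce();  // a second failure is a genuine error
}
```

A genuine error in the user's code therefore costs two compilation
attempts, while a clash caused only by stale cached state is recovered
transparently.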

Having everything ready in memory saves time, especially where the headers
are much more complex than the source code. With a short C++ program using
boost::numeric, boost::graph etc. or Eigen, we see a 10-50x speedup. We had
some code examples on the web site, which I have asked to be removed until
we can provide you with a beta release so that the results can be
independently replicated. These may be considered best-case examples, but
they are actually useful for programmers modifying and rebuilding a smaller
program based on heavily templated C++ infrastructure.

For a full LLVM build we don't yet have complete results, as not all
zapcc bugs are solved, but we do see about a 1.5x speedup over roughly the
first 55% of the build. This timing includes some linking and tablegen
runs, which take the same time with zapcc, so the compilation speedup is
actually somewhat better.
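
To see how much better, one can back out the compile-only speedup from
the overall figure. The sketch below assumes the build time splits into
a compilation fraction f that is sped up by a factor s and a remainder
(linking, tablegen) that is unchanged; compileOnlySpeedup is a
hypothetical helper, and the value of f used in the note after it is an
assumption, not a measurement.

```cpp
// Overall speedup S with compile fraction f and compile-only speedup s:
//   S = 1 / ((1 - f) + f / s)   =>   s = f / (1 / S - (1 - f))
// Precondition: 1 / S > 1 - f, i.e. the unchanged part alone must take
// less time than the whole accelerated build.
double compileOnlySpeedup(double overall, double compileFraction) {
  return compileFraction / (1.0 / overall - (1.0 - compileFraction));
}
```

For example, if compilation were 80% of the original build time, an
overall 1.5x speedup would correspond to roughly 1.71x on compilation
alone.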

We haven't compared with precompiled headers as they are really not
equivalent. Using precompiled headers is a non-trivial change to a
project's build and will not always help build time, depending on include
patterns. I'm not sure precompiled headers would benefit LLVM build time.
OTOH, zapcc builds the project as-is without redesign, with the exception
of renaming clashing names, a trivial refactoring.

Hoping to release a beta version soon,

Yaron

2015-05-23 21:05 GMT+03:00 Chris Lattner <clattner at apple.com>:

> Does anyone know anything about this?
> http://www.zapcc.com
>
> -Chris
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>