[cfe-dev] token caching ("PCH")
kremenek at apple.com
Tue Oct 21 10:43:46 PDT 2008
On Oct 21, 2008, at 9:24 AM, Cédric Venet wrote:
>> Added the start of a prototype implementation of PCH based on token caching.
> I only use precompiled headers with Visual Studio. VS's requirement
> for precompiled headers is that they must be included first and
> always in the same order (so I usually use one stdafx.h file to
> group them all). This seems like a minimal requirement for PCH, and
> your solution needs to follow it too (since the tokens are cached
> after being preprocessed, isn't it?). Then, why not simply serialize
> the AST at the end of the precompiled headers and reload it when
> needed? It would save parse and Sema time instead of just file
> reading, lexing, and preprocessing. (Do you have figures on what
> takes most of the time? I would have said Sema, at least for C++,
> but perhaps reading the files is the more time-consuming task.)
> Perhaps the response is simply that AST serialization is still
> broken; in that case, ignore me :)
Redirecting to cfe-dev.
These are fair questions. The short answer is that this is a stepping
stone towards other forms of PCH support in Clang.
Here is the long answer.
In the long term the leading idea on the table is to implement PCH via
lazy deserialization of ASTs. This means you only deserialize the
pieces of the header that you actually need. Beyond requiring
serialization support to be fully implemented, there are many details
to get right. It's a complicated solution that will take some time to
implement. In the meantime, we would like to have other forms of
"PCH" available today (which may also be compatible with the other
PCH solutions we implement).
The approach we are investigating right now is "token caching". The
idea is to cache the un-preprocessed (raw) tokens from all of the
files pulled in by the prefix header. The cached tokens can then be
used much as precompiled headers are used today: instead of lexing
the headers, we simply read in the tokens. The idea is to
drastically reduce lexing time.
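In case it helps, here is a minimal sketch of the idea, using made-up types (Clang's real Token and Preprocessor classes carry much more state: source locations, flags, identifier info, etc.). The point is only the shape: lex a file once, store the raw tokens, and replay them later instead of re-lexing:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal token; a stand-in for Clang's real Token class.
struct Token {
    std::string Spelling;
};

// "Lex" a whitespace-separated string into raw tokens. This stands in
// for the real lexer, which is the expensive step being cached.
std::vector<Token> lexOnce(const std::string &Source) {
    std::vector<Token> Toks;
    std::string Cur;
    for (char C : Source) {
        if (C == ' ') {
            if (!Cur.empty()) { Toks.push_back({Cur}); Cur.clear(); }
        } else {
            Cur += C;
        }
    }
    if (!Cur.empty()) Toks.push_back({Cur});
    return Toks;
}

// Replaying from the cache skips lexing entirely: the consumer just
// walks the stored token stream in order.
class TokenCache {
    std::vector<Token> Cached;
    size_t Pos = 0;
public:
    explicit TokenCache(std::vector<Token> T) : Cached(std::move(T)) {}
    // Returns the next cached token, or nullptr when exhausted.
    const Token *next() { return Pos < Cached.size() ? &Cached[Pos++] : nullptr; }
};
```

In the real design the cached tokens would be written to disk alongside the prefix header and mapped back in on later compiles, so replay avoids the file reads as well as the lexing.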
Here is the motivation for this approach.
We did some extensive timings of Clang. Without PCH, Clang is 2x
faster than GCC; with PCH enabled, GCC is (on average) 2x faster than
Clang.
This is on Mac OS X where we have some really huge headers (including
Cocoa.h is about 11MB of source code).
The majority of the work (50-60%) at -fsyntax-only is spent lexing.
Thus, significantly speeding up lexing alone will yield a substantial
performance win. Moreover, a significant amount of time is spent just
stat'ing header files, opening them, and reading them. Replaying the
cached tokens from a prefix header can remove a significant amount of
lexing time as well as a significant amount of system-call overhead.
Further, caching the lexed tokens is a simple solution that is
trivial to implement, which means we will have something that works
in the short term while we investigate other (complementary) approaches.
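As a back-of-envelope check (my arithmetic, not a measurement from the timings above, beyond the 50-60% lexing figure): if a fraction f of -fsyntax-only time is lexing and token replay makes it nearly free, Amdahl's law bounds the best-case overall speedup:

```cpp
#include <cassert>
#include <cmath>

// Amdahl-style bound: eliminating a fraction f of total time caps the
// overall speedup at 1 / (1 - f). With f in [0.5, 0.6], that is a
// best-case speedup of 2x to 2.5x at -fsyntax-only.
double bestCaseSpeedup(double LexFraction) {
    return 1.0 / (1.0 - LexFraction);
}
```

That bound ignores the (real) savings from avoided stat/open/read system calls, so it is a rough floor on what is at stake rather than a prediction.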
As to your point about simply deserializing all the ASTs in a header,
we (Daniel, Steve, Chris, and I) believe that would give roughly the
same performance as the token caching approach. There are three
arguments for this.
First, the time spent in Sema itself is largely spent allocating
objects, and deserializing all of the ASTs would recover only a
fraction of Sema time, since we would still need to allocate the same
number of objects. While it would also recover parse time, only 10%
of -fsyntax-only is spent parsing. Also, object deserialization
isn't free (there is computation involved in reading VBR-encoded
integers, parsing bitcode records, etc.), so any time recovered by
removing parsing would probably be replaced by work doing
deserialization.
Second, deserializing all of the ASTs in the prefix header would also
be far more complicated than caching tokens; not only would
serialization need to be feature complete (which we plan on doing
anyway) but more aspects of Clang, e.g. Sema, would need to be aware
of our PCH solution. The token caching approach means that only the
Preprocessor needs to be aware that it is getting its tokens from
another source; Parse and Sema remain unaffected.
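To illustrate that isolation (again with made-up types, not Clang's actual interfaces): if the consumer only ever asks "give me the next token" through one interface, it cannot tell a live lexer apart from a replayed cache, so nothing above the Preprocessor has to change:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Tok { std::string Text; };

// Hypothetical single point of contact: the parser pulls tokens from
// this interface and never knows where they come from.
class TokenSource {
public:
    virtual ~TokenSource() = default;
    virtual bool lex(Tok &Result) = 0;   // false when out of tokens
};

// Live lexing (stand-in: splits a string on spaces).
class LexerSource : public TokenSource {
    std::string Src;
    size_t I = 0;
public:
    explicit LexerSource(std::string S) : Src(std::move(S)) {}
    bool lex(Tok &Result) override {
        while (I < Src.size() && Src[I] == ' ') ++I;
        if (I >= Src.size()) return false;
        size_t Start = I;
        while (I < Src.size() && Src[I] != ' ') ++I;
        Result.Text = Src.substr(Start, I - Start);
        return true;
    }
};

// Replay from a pre-lexed cache.
class CachedSource : public TokenSource {
    std::vector<Tok> Toks;
    size_t Pos = 0;
public:
    explicit CachedSource(std::vector<Tok> T) : Toks(std::move(T)) {}
    bool lex(Tok &Result) override {
        if (Pos >= Toks.size()) return false;
        Result = Toks[Pos++];
        return true;
    }
};

// A consumer (stand-in for the parser) drains any TokenSource the
// same way, regardless of which implementation is behind it.
std::vector<std::string> drain(TokenSource &TS) {
    std::vector<std::string> Out;
    Tok T;
    while (TS.lex(T)) Out.push_back(T.Text);
    return Out;
}
```

The AST-deserialization approach has no such seam: the deserialized declarations would have to be woven back into Sema's scopes and lookup tables, which is why more of Clang would need to know about that form of PCH.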
Third, token caching is also something we can probably implement in a
week or so, so it is something we can quickly evaluate to see if it
buys us anything. Further, token caching has some other incidental
benefits; for example, we will be able to get better measurements of
how much time we spend in preprocessing.
As I mentioned earlier, the ultimate solution (as currently
conceptualized) for PCH in Clang is to indeed use AST serialization,
but only to deserialize the pieces of the header that are needed.
This requires the ability to do lazy deserialization of ASTs in Sema,
something which our current infrastructure just doesn't really support
right now. Moreover, C++ will bring a whole host of other
complexities when considering fast PCH support, so there is some
benefit in waiting until C++ support progresses a little further so we
can get a better idea of the right design tradeoffs for PCH support in C++.