[cfe-dev] Diagnostic IDs, parsing speed and how to generate lookup tables

Manuel Klimek klimek at google.com
Thu Dec 13 01:46:43 PST 2012


On Wed, Dec 12, 2012 at 11:45 PM, John McCall <rjmccall at apple.com> wrote:

> On Dec 12, 2012, at 6:00 AM, Manuel Klimek <klimek at google.com> wrote:
>
> While doing some parsing speed investigation I noticed that a lot of
> diagnostics code is on the very hot path for parsing time. One thing that
> fell out of that is Benjamin's recent patch that speeds up getting the
> diagnostic info. But there's more to be gained: with a simple cache for
> the diagnostic classes, I was able to get another 1.5% parsing speedup
> (benchmarked over all Google code).
>
> Unfortunately the patch in its current state is not thread safe; the
> obvious solution to that problem would be to generate the diagnostic class
> table statically instead of writing to a cache at runtime.
>
> This turns out to be surprisingly hard though - the diagnostic id start
> values are defined in an enum, while the actual diagnostics come from the
> .td files.
>
> A first idea was that we might define the start values inside the .td
> files, and create the enum from that, instead of the other way around.
>
> That led to the question Dmitri posed on IRC: why are there start ranges
> in the first place? We could instead tablegen all diagnostics at once and
> let tablegen take care of generating the appropriate enum values for
> the start of certain ranges.
>
> So if anybody has better ideas for how to solve the lookup problem, or
> knows why the diagnostic ids have fixed start values, I'd be very
> interested to learn more about it :)
>
>
> The purpose of the fixed starting values is just compile-time efficiency:
> we'd like to be able to add/remove diagnostics (generally in one part of
> clang) without requiring a full recompile.  As long as your scheme still
> allows this in most cases — maybe tblgen gets re-run for all diagnostics
> whenever any of them change, but tblgen rounds up to the next multiple of
> 50 between ranges so that adding a diagnostic in one range doesn't usually
> force a full recompile — I think we're fine.
>

Of the Diagnostic*Kinds.td files, the only one that gets updated fairly
regularly (every 1-2 days) is DiagnosticSemaKinds.td. Can't we just put it
last, and save the hassle of putting holes into the generated lookup
tables? (The next most frequently changed file is touched ~4-5 times per
month; over that time frame a full rebuild is in order anyway.)
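
To make the comparison concrete, here is a rough sketch of how the start
values could be computed in both schemes. It is not taken from clang or
tablegen: `Category`, `computeStarts`, `tableSize` and the per-category
counts are all made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one diagnostic category. The real counts
// would come from the Diagnostic*Kinds.td files.
struct Category {
  std::string Name;
  unsigned Count; // number of diagnostics currently in this category
};

// Round Value up to the next multiple of Multiple.
unsigned roundUp(unsigned Value, unsigned Multiple) {
  assert(Multiple != 0 && "padding granularity must be non-zero");
  return (Value + Multiple - 1) / Multiple * Multiple;
}

// Compute the first ID of every category, in the given order.
// With Padding > 1 the end of each range is rounded up, which leaves unused
// holes in any table indexed by diagnostic ID. With Padding == 1 the ranges
// are packed back to back.
std::vector<unsigned> computeStarts(const std::vector<Category> &Cats,
                                    unsigned Padding) {
  std::vector<unsigned> Starts;
  unsigned Next = 0;
  for (const Category &C : Cats) {
    Starts.push_back(Next);
    Next = roundUp(Next + C.Count, Padding);
  }
  return Starts;
}

// Number of slots a lookup table indexed by diagnostic ID would need.
unsigned tableSize(const std::vector<Category> &Cats, unsigned Padding) {
  if (Cats.empty())
    return 0;
  return computeStarts(Cats, Padding).back() + Cats.back().Count;
}
```

With made-up counts of 10, 20, 30 and 100 for Common, Lex, Parse and Sema
(Sema last), the packed scheme gives start values 0, 10, 30, 60 and a
160-entry table, and adding a Sema diagnostic moves no start value. Padding
to multiples of 50 gives 0, 50, 100, 150 and a 250-entry table, but also
absorbs an added Lex diagnostic, which the packed scheme does not.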

Cheers,
/Manuel