[cfe-dev] [RFC] cHash: Integration of AST hashing

Christian Dietrich via cfe-dev cfe-dev at lists.llvm.org
Thu Aug 3 04:39:14 PDT 2017


Alex Lorenz via cfe-dev <cfe-dev at lists.llvm.org> writes:

> Hey,
>
> What impact does the hashing itself have on the build time? I.e. If I
> were to fully rebuild a project from scratch, how much longer would it
> take?

We have build over 2300 commits from the musl libc library. In total
that are 5.6 million compiler invocations (all commits are built from
scratch). On average, the AST hashing took 9ms. In comparision, the
parsing is 187ms and the -O2 optimization and assembling 1082ms. So the
hashing time alone is almost negliable. Furthermore, the hashing time
does not depend on the syntactic elements in the AST, but only on the
used parts. So even if you have a thousands of type declarations, a file
with a single 'void foo() {}' takes no time.

We use a non-cryptographical hash (MurMur2) to speed things up a little
bit. And since we do not have to defend against evil attackers, we
believe a non-cryptographic hash is enough. However, for our current
implementation, we still have a few optimizations that could cut down a
little bit on the hashing[1].

chris

[1] We hash currently every type annotation for every AST note. However,
    only the type fields for variable declarations and function
    signatures are relevant, since all other type fields are derived
    from them in a determinsitic fashion.
-- 
Christian Dietrich, M.Sc. (Scientific Staff)
Institute for Systems Engineering (Systems and Computerarchitecture)
Leibniz Universität Hannover
Appelstraße 4
30167 Hannover, Germany

Tel:    +49 511 762-19737
Fax:    +49 511 762-19733
eMail:  dietrich at sra.uni-hannover.de
WWW:    https://www.sra.uni-hannover.de



More information about the cfe-dev mailing list