[PATCH] InstrProf: Calculate a better function hash

Chandler Carruth chandlerc at google.com
Tue Mar 25 12:31:01 PDT 2014


On Tue, Mar 25, 2014 at 12:11 PM, Justin Bogner <mail at justinbogner.com>wrote:

> Chandler Carruth <chandlerc at google.com> writes:
> > FNV is actually based on the same principles as Bernstein's -- it is
> relying
> > on multiplication to spread the bits throughout an integers state, and
> xor (or
> > addition as you originally wrote the patch, many variations on
> Bernstein's use
> > xor though).
> >
> > These all will have reasonably frequent collisions in addition to be
> poorly
> > distributed over the space. You've indicated you don't care about the
> > distribution, but do care about collisions.
> >
> > Also, you've asserted speed claims without data. Both Bernstein's hash
> (in its
> > original formulation, your code was actually a strange variant of it that
> > didn't operate on bytes or octets) and FNV are necessarily a
> byte-at-a-time
> > and thus *quite* slow for inputs of even several hundered bytes.
> >
> > We actually have a variation of CityHash that I implemented which is a
> bit
> > faster than CityHash (and for strings of bytes more than 128 bytes,
> several
> > times faster than Bernstein's) but has similarly strong collision
> resistance.
> >
> > But how much data are we talking about? And how frequently are you
> computing
> > this? MD5 is actually reasonably fast on modern hardware. The reference
> > benchmarks have shown roughly 500 cycles to compute the MD5 of an 8-byte
> > message, and 800 or 900 cycles to compute the MD5 of a 64-byte message. I
> > would expect traversing the AST to build the inputs for this to be
> > significantly slower due to cache misses, but I think benchmarks would
> help
> > here.
>
> This won't be much data per hash, but it needs to be calculated once per
> function being compiled. My gut says any of these hashes will be
> sufficient, but it'll help to describe the problem domain.
>

I've read all the patch and am fairly familiar with the design...


> The hash is based on the structure of the AST for a function, so:
>
> - The input domain is a set of "interesting" statements and decls, which
>   Duncan's patch represents in the ASTHash enum. There are currently 16
>   distinct values, and I wouldn't expect this to grow much.
>
> - The length of the input is directly correlated with the amount of
>   control flow in a function.  This will often be quite short (a few if
>   statements and loops, say) but may be quite long in the presence of a
>   giant state machine implemented by a switch, or some other monstrous
>   function. I'd expect this to usually be counted in tens, rather than
>   hundreds, and it won't be uncommon for it to be one or two.
>

Yep.


>
> - The collisions we're concerned about are ones that are likely to occur
>   from a function being changed. When someone modifies the control flow
>   of a function, our profile is no longer valid for that function. We
>   also check the number of counters though, so only collisions for the
>   same length of input matter at all.
>

Yes, so things like single bit flips in the message.


>
> This sounds to me like it would be pretty similar to hashing short
> strings, which bernstein is generally considered to be reasonably good
> at IIRC, but I'm no expert on the subject.


Bernstein's hash happens to be effective of short strings of *ascii*
printable characters. It is not highly rilient to single bit flips in all
bits of the input. Notably, as pointed out by Bob Jenkins and others, there
is a funnel where 0x21 and 0x100 have the same hash (33). Bernstein's hash
became popular in no small part because it has unusual properties with
*ascii* text: lowercase alpha strings of 6 characters or smaller have zero
collisions in 32-bits. I don't see any way that it is a useful hashing
algorithm for something like this.

FNV is slightly better in that it works for any byte stream rather than
being carefully chosen to work with ascii characters. This is primarily
because it uses a prime multiplier. However, it *requires* fast integer
multiplies (and thus is often quite slow on non-Intel chips) and has a
tendency to scale *very* poorly to large input messages due to the
byte-stream nature of the beast.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140325/b0c9dc77/attachment.html>


More information about the cfe-commits mailing list