[PATCH] InstrProf: Calculate a better function hash

Chandler Carruth chandlerc at google.com
Wed Mar 26 14:58:32 PDT 2014


Generally, this seems fine. Some comments about the patch itself:

+  // Check that we only have six bits.
+  assert(unsigned(Type) < 1u << NumBitsPerType &&
+         "Hash is invalid: too many types");

I would love parentheses around the << operator as otherwise my brain
struggles with the precedence rules.

Also, maybe a complementary static_assert on the enum itself?

+  static_assert(!FunctionLikeDecl, "FunctionLikeDecl should have value 0");

I would find comparing with 0 more clear than using ! on an enum...



On Wed, Mar 26, 2014 at 2:51 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:

> On Mar 25, 2014, at 5:39 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com>
> wrote:
>
> > On Mar 25, 2014, at 2:30 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
> >
> >>
> >> On Tue, Mar 25, 2014 at 1:59 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> >>  2. Grab the upper 64 bits of MD5.
> >>
> >> How about this.
> >>
> >> Pack the bits of the enum into a 64-bit integer. When we fill the
> integer, spin up an MD5 context, and shove the integer through it, rinse,
> repeat. When done, if you have an MD5, take the upper 64-bits. If not, just
> take the integer.
> >>
> >> This should incur no cost for the most common case (only a few CFG
> elements) and no collisions roughly ever, and scale nicely due to using
> 64-bit chunks.
> >>
> >> s/MD5/any other stable hashing function really/ -- my hope is that
> after doing this, the performance of the case where all the CFG elements
> didn't just serialize is relatively unimportant.
> >>
> >> Then, benchmark it, and if MD5 is a problem, revisit it with faster
> and/or lower overhead algorithms which still have well known and fixed
> results such as MD4, blake2, etc.
> >
> > Sounds like a great compromise.  I’ll get a new patch posted soon.
>
> Thanks again for the reviews.  Here’s an updated patch.  I chose to leave
> 6 bits
> for the enum to leave space for expansion (so 10 values get packed per
> 64-bits).
>
> > On Mar 25, 2014, at 2:33 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
> >
> >> As a co-worker who also works on hashing just reminded me, this relies
> on the fact that none of the enum values are 0, otherwise you end up with
> 'x' and 'yx' for two sequences which trivially collide.
>
> Note:  there is an enum value that’s zero:  FunctionLikeDecl.  Each time a
> hash is
> calculated, the first (and only the first) counter is assigned to the
> function
> itself (or objective-c block, etc.).  The hash logic effectively skips it.
>  Only
> subsequent (and non-zero) values get combined.
>
> The enum values that do get combined are all non-zero.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140326/f2ffb3ed/attachment.html>


More information about the cfe-commits mailing list