[PATCH] InstrProf: Calculate a better function hash
Chandler Carruth
chandlerc at google.com
Wed Mar 26 14:58:32 PDT 2014
Generally, this seems fine. Some comments about the patch itself:
+ // Check that we only have six bits.
+ assert(unsigned(Type) < 1u << NumBitsPerType &&
+ "Hash is invalid: too many types");
I would love parentheses around the << operator as otherwise my brain
struggles with the precedence rules.
Also, maybe a complementary static_assert on the enum itself?
+ static_assert(!FunctionLikeDecl, "FunctionLikeDecl should have value 0");
I would find comparing with 0 more clear than using ! on an enum...
On Wed, Mar 26, 2014 at 2:51 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:
> On Mar 25, 2014, at 5:39 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com>
> wrote:
>
> > On Mar 25, 2014, at 2:30 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
> >
> >>
> >> On Tue, Mar 25, 2014 at 1:59 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> >> 2. Grab the upper 64 bits of MD5.
> >>
> >> How about this.
> >>
> >> Pack the bits of the enum into a 64-bit integer. When we fill the
> integer, spin up an MD5 context, and shove the integer through it, rinse,
> repeat. When done, if you have an MD5, take the upper 64-bits. If not, just
> take the integer.
> >>
> >> This should incur no cost for the most common case (only a few CFG
> elements) and no collisions roughly ever, and scale nicely due to using
> 64-bit chunks.
> >>
> >> s/MD5/any other stable hashing function really/ -- my hope is that
> after doing this, the performance of the case where all the CFG elements
> didn't just serialize is relatively unimportant.
> >>
> >> Then, benchmark it, and if MD5 is a problem, revisit it with faster
> and/or lower overhead algorithms which still have well known and fixed
> results such as MD4, blake2, etc.
> >
> > Sounds like a great compromise. I’ll get a new patch posted soon.
>
> Thanks again for the reviews. Here’s an updated patch. I chose to leave
> 6 bits
> for the enum to leave space for expansion (so 10 values get packed per
> 64-bits).
>
> > On Mar 25, 2014, at 2:33 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
> >
> >> As a co-worker who also works on hashing just reminded me, this relies
> on the fact that none of the enum values are 0, otherwise you end up with
> 'x' and 'yx' for two sequences which trivially collide.
>
> Note: there is an enum value that’s zero: FunctionLikeDecl. Each time a
> hash is
> calculated, the first (and only the first) counter is assigned to the
> function
> itself (or objective-c block, etc.). The hash logic effectively skips it.
> Only
> subsequent (and non-zero) values get combined.
>
> The enum values that do get combined are all non-zero.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140326/f2ffb3ed/attachment.html>
More information about the cfe-commits
mailing list