[PATCH] InstrProf: Calculate a better function hash

Duncan P. N. Exon Smith dexonsmith at apple.com
Wed Mar 26 14:51:43 PDT 2014


On Mar 25, 2014, at 5:39 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:

> On Mar 25, 2014, at 2:30 PM, Chandler Carruth <chandlerc at google.com> wrote:
> 
>> 
>> On Tue, Mar 25, 2014 at 1:59 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>  2. Grab the upper 64 bits of MD5.
>> 
>> How about this.
>> 
>> Pack the bits of the enum into a 64-bit integer. When we fill the integer, spin up an MD5 context, and shove the integer through it, rinse, repeat. When done, if you have an MD5, take the upper 64-bits. If not, just take the integer.
>> 
>> This should incur no cost for the most common case (only a few CFG elements) and no collisions roughly ever, and scale nicely due to using 64-bit chunks.
>> 
>> s/MD5/any other stable hashing function really/ -- my hope is that after doing this, the performance of the case where all the CFG elements didn't just serialize is relatively unimportant.
>> 
>> Then, benchmark it, and if MD5 is a problem, revisit it with faster and/or lower overhead algorithms which still have well known and fixed results such as MD4, blake2, etc.
> 
> Sounds like a great compromise.  I’ll get a new patch posted soon.

Thanks again for the reviews.  Here’s an updated patch.  I chose to leave 6 bits
for the enum to leave space for expansion (so 10 values get packed per 64-bits).

> On Mar 25, 2014, at 2:33 PM, Chandler Carruth <chandlerc at google.com> wrote:
> 
>> As a co-worker who also works on hashing just reminded me, this relies on the fact that none of the enum values are 0, otherwise you end up with 'x' and 'yx' for two sequences which trivially collide.

Note:  there is an enum value that’s zero:  FunctionLikeDecl.  Each time a hash is
calculated, the first (and only the first) counter is assigned to the function
itself (or objective-c block, etc.).  The hash logic effectively skips it.  Only
subsequent (and non-zero) values get combined.

The enum values that do get combined are all non-zero.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pgo-hash-2.patch
Type: application/octet-stream
Size: 32365 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140326/5e36784c/attachment.obj>


More information about the cfe-commits mailing list