[LLVMdev] ADT/Hashing.h on 32-bit platforms

Chandler Carruth chandlerc at google.com
Mon Feb 3 14:10:19 PST 2014


A couple of minor comments about the details of the patch:

-  hash_state state = state.create(s_begin, seed);
+  hash_state state = hash_state::create(s_begin, seed);

Why this change? The existing code is correct already I think?

-  return ::llvm::hashing::detail::hash_integer_value(value);
+  // the cast is necessary for strong enums and avoids compiler warnings

Generally LLVM comments should be formed as prose (starting with a
capitalized word and ending with appropriate punctuation). I don't think
you need a comment here though, or a separate variable. I would just
explicitly cast the value in the argument.

+  const uint64_t n = static_cast<uint64_t>(value);
+  return ::llvm::hashing::detail::hash_integer_value(n);




On Mon, Feb 3, 2014 at 1:46 PM, Stephan Tolksdorf <st at quanttec.com> wrote:

> On 02.02.14 00:48, Chandler Carruth wrote:
>
>> On Sat, Feb 1, 2014 at 8:35 AM, Stephan Tolksdorf <st at quanttec.com
>> <mailto:st at quanttec.com>> wrote:
>>
>>     Hi,
>>
>>     Currently the hashing implementation in ADT/Hashing.h produces hash
>>     values on 32-bit platforms that differ from the lower 32-bits of the
>>     hash values produced on 64-bit platforms. It seems the only reason
>>     for this difference is that the uint64_t integer seed is truncated
>>     to size_t. Since the usage of uint64_t and size_t as types for seed
>>     values in the implementation is somewhat inconsistent, I'm
>>     wondering: Was this difference deliberately introduced as a
>>     performance optimization for 32-bit code, or is this maybe just an
>>     implementation artifact?
>>
>>     I also noted that the implementation relies on implicit conversions
>>     from uint64_t to size_t in several places where hash_code values are
>>     returned, which may trigger compiler warnings on 32-bit platforms.
>>
>>
>> Almost all of this was largely incidental on my part. I didn't spend a
>> lot of time tuning the 32-bit platform performance. If you have ideas
>> that you think would make things better, I'm certainly interested. =D
>>
>> One thing I would note is that I would prefer to retain the collision
>> resistance where possible. So I would be leery of reducing the internal
>> state used. I would more focus on making the outputs friendly for 32-bit
>> platforms and the inputs consistent.
>>
>
> I've attached a patch that makes the hashing implementation consistently
> use a 64-bit seed everywhere.  With this change the implementation should
> produce the same hash codes modulo SIZE_MAX + 1 for identical values
> independent of the platform. (Though hash_combine may still produce
> different results if the size of an argument type differs between
> platforms). I suspect the negative performance impact on 32-bit platforms
> should be small, but I didn't do any benchmarking. With atomics one could
> probably replace the thread safe local static in get_execution_seed with
> something that has a little less overhead.
>
> The patch also removes a FIXME from the set_fixed_execution_seed
> implementation and rewords the documentation string a bit, since using this
> function in a multi-threaded program requires some kind of external
> synchronization anyway.
>
> Another attached patch implements llvm::sys::Process::GetRandomNumber()
> on Windows. (There's already a Unix implementation.) This function could be
> useful for implementing the missing random per-execution seed
> functionality. (Though it might take a little care to deal with
> Process::GetRandomNumberSeed potentially calling back into hash_combine on
> Unix.)
>
> In a little experiment I made, changing the seed per execution seemed to
> lead to test suite errors for clang's PCH support and other components.
>
> - Stephan
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/373a6367/attachment.html>


More information about the llvm-dev mailing list