[llvm-commits] [llvm] r150890 - in /llvm/trunk: include/llvm/ADT/Hashing.h lib/Support/CMakeLists.txt lib/Support/Hashing.cpp unittests/ADT/HashingTest.cpp unittests/CMakeLists.txt

Mon Feb 20 10:38:23 PST 2012

On 18 February 2012 21:00, Talin <viridia at gmail.com> wrote:
> Added: llvm/trunk/include/llvm/ADT/Hashing.h

Nitpick: wouldn't this be better in ../Support?

> +  /// Add a float
> +  GeneralHash& add(float Data) {
> +    union {
> +      float D; uint32_t I;
> +    };
> +    D = Data;
> +    addInt(I);
> +    return *this;
> +  }
> +
> +  /// Add a double
> +  GeneralHash& add(double Data) {
> +    union {
> +      double D; uint64_t I;
> +    };
> +    D = Data;
> +    addInt(I);
> +    return *this;
> +  }

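IMO it would be better not to implement these at all until someone
needs them and decides what to do about the +/- 0 problem: +0.0 and
-0.0 compare equal but have different bit patterns, so hashing the raw
bits gives equal values different hashes. (But that's just another
nitpick!)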
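
To make the +/- 0 issue concrete, here is a minimal sketch, assuming
IEEE 754 single precision. floatBits and canonicalFloatBits are
hypothetical helpers for illustration, not part of the patch; folding
-0.0 into +0.0 is only one possible policy.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not in the patch): raw bit pattern of a float.
// memcpy is used instead of a union to avoid type-punning concerns.
static uint32_t floatBits(float F) {
  uint32_t I;
  std::memcpy(&I, &F, sizeof I);
  return I;
}

// Hypothetical helper: one possible policy, assuming IEEE 754.
// Both zeros compare equal to 0.0f, so both map to the all-zero
// bit pattern of +0.0f; every other value keeps its raw bits.
static uint32_t canonicalFloatBits(float F) {
  return F == 0.0f ? 0u : floatBits(F);
}
```

With these, 0.0f == -0.0f holds while floatBits(0.0f) and
floatBits(-0.0f) differ, which is exactly the case a hash of the raw
bits gets wrong for keys that compare equal.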

> +// Add a possibly unaligned sequence of bytes.
> +void GeneralHash::addUnaligned(const uint8_t *I, const uint8_t *E) {
> +  ptrdiff_t Length = E - I;
> +  if (uintptr_t(I) & 3 == 0) {
> +    while (Length > 3) {
> +      mix(*reinterpret_cast<const uint32_t *>(I));
> +      I += 4;
> +      Length -= 4;
> +    }
> +  } else {
> +    while (Length > 3) {
> +      mix(
> +        uint32_t(I[0]) +
> +        (uint32_t(I[1]) << 8) +
> +        (uint32_t(I[2]) << 16) +
> +        (uint32_t(I[3]) << 24));
> +      I += 4;
> +      Length -= 4;
> +    }
> +  }

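I think there's a serious problem here on big-endian hosts: the
aligned loop reads each word in host byte order, while the unaligned
loop always assembles it as little-endian, so identical arrays of
bytes will hash to different values depending on whether they happen
to start on a 4-byte boundary or not.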
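
A minimal sketch of the mismatch and of one way around it. The helper
names (loadLE32, loadNative32, hostIsLittleEndian) are hypothetical and
not part of the patch. Note that, as quoted, `uintptr_t(I) & 3 == 0`
parses as `uintptr_t(I) & (3 == 0)`, which is always zero, so this
assumes the intended test is `(uintptr_t(I) & 3) == 0`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: assemble four bytes as a little-endian word.
// The result depends only on the byte values, not on the host byte
// order or on the alignment of P.
static uint32_t loadLE32(const uint8_t *P) {
  return uint32_t(P[0]) | (uint32_t(P[1]) << 8) |
         (uint32_t(P[2]) << 16) | (uint32_t(P[3]) << 24);
}

// Hypothetical helper: read four bytes in host byte order, which is
// what dereferencing a uint32_t pointer in the aligned loop yields.
static uint32_t loadNative32(const uint8_t *P) {
  uint32_t W;
  std::memcpy(&W, P, sizeof W);
  return W;
}

// Hypothetical helper: true when the lowest-addressed byte of a word
// holds its least significant bits.
static bool hostIsLittleEndian() {
  const uint32_t One = 1;
  uint8_t First;
  std::memcpy(&First, &One, 1);
  return First == 1;
}
```

On a little-endian host loadNative32 and loadLE32 agree, so the two
loops mix the same words. On a big-endian host they differ for any
word that is not a byte palindrome. Using the loadLE32 form in both
loops (or byte-swapping in the aligned one) would make the hash depend
only on the bytes.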

Thanks for working on this!
Jay.