On Mon, Feb 20, 2012 at 10:38 AM, Jay Foad <span dir="ltr"><<a href="mailto:jay.foad@gmail.com">jay.foad@gmail.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


On 18 February 2012 21:00, Talin <<a href="mailto:viridia@gmail.com">viridia@gmail.com</a>> wrote:<br>

> Added: llvm/trunk/include/llvm/ADT/Hashing.h<br>

<br>

Nitpick: wouldn't this be better in ../Support ?<br></blockquote><div><br></div><div>I'm not sure what the boundary between ADT and Support is - I put it in ADT because containers use it. However, if someone moves it to Support I won't complain. </div>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

> +  /// Add a float<br>

> +  GeneralHash& add(float Data) {<br>

> +    union {<br>

> +      float D; uint32_t I;<br>

> +    };<br>

> +    D = Data;<br>

> +    addInt(I);<br>

> +    return *this;<br>

> +  }<br>

> +<br>

> +  /// Add a double<br>

> +  GeneralHash& add(double Data) {<br>

> +    union {<br>

> +      double D; uint64_t I;<br>

> +    };<br>

> +    D = Data;<br>

> +    addInt(I);<br>

> +    return *this;<br>

> +  }<br>

<br>

IMO it would be better not to implement these at all until someone<br>

needs them, and decides what to do about the +/- 0 problem. (But<br>

that's just another nitpick!)<br>

<div class="im"><br></div></blockquote><div>For uniquing of ConstantFPs, I think you do want to distinguish between +/- 0.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div class="im">

> +// Add a possibly unaligned sequence of bytes.<br>

> +void GeneralHash::addUnaligned(const uint8_t *I, const uint8_t *E) {<br>

> +  ptrdiff_t Length = E - I;<br>

> +  if (uintptr_t(I) & 3 == 0) {<br>

</div><div class="im">> +    while (Length > 3) {<br>

> +      mix(*reinterpret_cast<const uint32_t *>(I));<br>

> +      I += 4;<br>

> +      Length -= 4;<br>

> +    }<br>

> +  } else {<br>

> +    while (Length > 3) {<br>

> +      mix(<br>

> +        uint32_t(I[0]) +<br>

> +        (uint32_t(I[1]) << 8) +<br>

> +        (uint32_t(I[2]) << 16) +<br>

> +        (uint32_t(I[3]) << 24));<br>

> +      I += 4;<br>

> +      Length -= 4;<br>

> +    }<br>

> +  }<br>

<br>

</div>I think there's a serious problem here on big-endian hosts, because<br>

identical arrays of bytes will hash to different values depending on<br>

whether they happen to start on a 4-byte boundary or not.<br></blockquote><div><br></div><div>Good catch. Shouldn't be hard to fix I would think. </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<br>

Thanks for working on this!<br>

<font color="#888888">Jay.<br>

</font></blockquote></div><br><br clear="all"><div><br></div>-- <br>-- Talin<br>