[PATCH] D30555: LTO: Hash type identifier resolutions for WholeProgramDevirt.

Fri Mar 3 09:45:02 PST 2017

pcc added inline comments.

================
Comment at: llvm/lib/LTO/LTO.cpp:104
+    Hasher.update(ArrayRef<uint8_t>{Data, 8});
+  };
   AddString(Conf.CPU);
----------------
mehdi_amini wrote:
> pcc wrote:
> > mehdi_amini wrote:
> > > Why do we hash bit-by-bit? Is it because of endianness? Do we care about having the same hash on different endian machine?
> > It makes the code easier to reason about if you always use the same hash function rather than making the code machine dependent.
> I don't really get why the code here is "easier to reason about" than:
> 
> ```
> auto AddUint64 = [&](uint64_t I) {
>     Hasher.update(&I, sizeof(I));
>   };
> ```
If you reuse the cache on an opposite endian host you'd need to worry about collisions if all integers are byte swapped. I reasoned through this and figured out that it probably can't happen in practice, but the point is that we can avoid needing to rely on such reasoning.

Also, your code breaks strict aliasing rules. But hey, everyone breaks them so maybe it doesn't matter. I don't care that much so if you feel strongly I'll just change it.

https://reviews.llvm.org/D30555