[PATCH] D30555: LTO: Hash type identifier resolutions for WholeProgramDevirt.

Fri Mar 3 09:46:59 PST 2017

mehdi_amini accepted this revision.
mehdi_amini added inline comments.
This revision is now accepted and ready to land.

================
Comment at: llvm/lib/LTO/LTO.cpp:104
+    Hasher.update(ArrayRef<uint8_t>{Data, 8});
+  };
   AddString(Conf.CPU);
----------------
pcc wrote:
> mehdi_amini wrote:
> > pcc wrote:
> > > mehdi_amini wrote:
> > > > Why do we hash bit-by-bit? Is it because of endianness? Do we care about having the same hash on different endian machine?
> > > It makes the code easier to reason about if you always use the same hash function rather than making the code machine dependent.
> > I don't really get why the code here is "easier to reason about" than:
> > 
> > ```
> > auto AddUint64 = [&](uint64_t I) {
> >     Hasher.update(&I, sizeof(I));
> >   };
> > ```
> If you reuse the cache on an opposite endian host you'd need to worry about collisions if all integers are byte swapped. I reasoned through this and figured out that it probably can't happen in practice, but the point is that we can avoid needing to rely on such reasoning.
> 
> Also, your code breaks strict aliasing rules. But hey, everyone breaks them so maybe it doesn't matter. I don't care that much so if you feel strongly I'll just change it.
Right I wrote the code as a snippet.
And no I don't feel strongly about this, but I like to understand what's going on :)

https://reviews.llvm.org/D30555