[PATCH] D99279: [lld-macho] Parallelize UUID hash computation

Wed Mar 24 12:02:57 PDT 2021

int3 added a comment.

> Would you mind adding the file size and the number of cores to the summary?

Ah yeah I'd forgotten to do that, good catch

================
Comment at: lld/MachO/Writer.cpp:921-925
+  std::vector<uint64_t> hashes(chunks.size());
+  parallelForEachN(0, chunks.size(),
+                   [&](size_t i) { hashes[i] = xxHash64(chunks[i]); });
+  uint64_t digest = xxHash64({reinterpret_cast<uint8_t *>(hashes.data()),
+                              hashes.size() * sizeof(uint64_t)});
----------------
oontvoo wrote:
> Looking at the implementation of xxHash64, I think the `reinterpret_cast<uint8_t *>` is UB. 
> Specifically, it tried to iterate the array by `+8`.
> 
> How about this?
> Looking at the implementation of xxHash64, I think the reinterpret_cast<uint8_t *> is UB.
> Specifically, it tried to iterate the array by +8.

Can you elaborate? I'm not sure I follow.

As for using `write64le` to write the intermediate values, I know LLD-ELF does it, but seems pretty superfluous to me. I guess it means that we'll get the same hash values if we run LLD on a big-endian host machine, but ... when will that ever happen? And if someone tried that, I'm pretty sure lots of things would break anyway, since we don't have any contbuilds covering this.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99279/new/

https://reviews.llvm.org/D99279