[PATCH] D153692: Fixed D147740 - [llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map

Wenlei He via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Jun 25 16:34:12 PDT 2023


wenlei added a comment.

Copying comments from the original patch for continuation..

> In D147740#4443233 <https://reviews.llvm.org/D147740#4443233>, @huangjd wrote:
> Actually do we really care about MD5 collision? ExtBinary format already ignored MD5 collision for regular string names (and therefore regular function profiles), as only one of two functions with colliding MD5 get written to the name table (and the other is therefore lost). If we are using CS profiles, since different CS profiles have different serializations, their > hashes are distributed as expected. The most important thing is that, even if we detect a hash collision, we can't do anything about it except logging it (using a multi-map makes the reader much slower), so I think the MD5 collision check should be marked as LLVM_DEBUG. This does reduce 0.5 second out of ~30 seconds (1.67%) over the 1 GB profile read .

I don't think we care. Is the new type HashKeyMap and SampleProfileMap all for detecting and reporting collision? I'd avoid all that complexity and prefer a simple DenseMap + a SampleContext->hash_code converter and not even bother with debug prints for collision...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153692/new/

https://reviews.llvm.org/D153692



More information about the llvm-commits mailing list