[llvm] [llvm-profdata] Remove MD5 collision check in D147740 (PR #66544)

David Li via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 15 13:42:54 PDT 2023


================
@@ -1299,19 +1299,13 @@ raw_ostream &operator<<(raw_ostream &OS, const FunctionSamples &FS);
 /// performance of insert and query operations especially when hash values of
 /// keys are available a priori, and reduces memory usage if KeyT has a large
 /// size.
-/// When performing any action, if an existing entry with a given key is found,
-/// and the interface "KeyT ValueT::getKey<KeyT>() const" to retrieve a value's
-/// original key exists, this class checks if the given key actually matches
-/// the existing entry's original key. If they do not match, this class behaves
-/// as if the entry did not exist (for insertion, this means the new value will
-/// replace the existing entry's value, as if it is newly inserted). If
-/// ValueT::getKey<KeyT>() is not available, all keys with the same hash value
-/// are considered equivalent (i.e. hash collision is silently ignored). Given
-/// such feature this class should only be used where it does not affect
-/// compilation correctness, for example, when loading a sample profile.
+/// All keys with the same hash value are considered equivalent (i.e. hash
+/// collision is silently ignored). Given such feature this class should only be
+/// used where it does not affect compilation correctness, for example, when
+/// loading a sample profile.
 /// Assuming the hashing algorithm is uniform, the probability of hash collision
 /// with 1,000,000 entries is
-/// (2^64)!/((2^64-1000000)!*(2^64)^1000000) ~= 3*10^-8.
+/// 1 - (2^64)!/((2^64-1000000)!*(2^64)^1000000) ~= 3*10^-8.
----------------
david-xl wrote:

Writing the general formula would be more helpful: 1 - P(S,E)/S^E, where S is the number of slots and E is the number of entries.
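
For reference, the general formula can be evaluated numerically. This is a minimal sketch, not code from the patch: `collisionProbability` is a hypothetical helper name, and it assumes P(S,E) = S!/(S-E)!, matching the factorial form in the comment under review. It works in log space, since P(S,E)/S^E is the product of (1 - i/S) for i from 0 to E-1 and the factorials themselves are far too large to compute directly.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of LLVM): probability that at least two of
// `entries` uniformly hashed keys share one of `slots` hash values, i.e.
//   1 - P(S,E)/S^E, assuming P(S,E) = S!/(S-E)!.
// `slots` is a double so that 2^64 can be represented.
double collisionProbability(double slots, std::uint64_t entries) {
  assert(slots > 0 && "need at least one slot");
  // Pigeonhole: with more entries than slots a collision is certain.
  if (static_cast<double>(entries) > slots)
    return 1.0;
  // log(P(S,E)/S^E) = sum over i in [0, E) of log(1 - i/S).
  double logNoCollision = 0.0;
  for (std::uint64_t i = 0; i < entries; ++i)
    logNoCollision += std::log1p(-static_cast<double>(i) / slots);
  // 1 - exp(x), computed without cancellation when x is close to 0.
  return -std::expm1(logNoCollision);
}
```

Calling `collisionProbability(std::ldexp(1.0, 64), 1000000)` gives roughly 2.7*10^-8, which agrees with the rounded ~= 3*10^-8 in the comment. Using `log1p` and `expm1` matters here because each term is on the order of 10^-14 or smaller and would lose precision in a naive `1 - exp(sum of log(1 - x))`.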

https://github.com/llvm/llvm-project/pull/66544

