[PATCH] D133715: [ADT] Add HashMappedTrie

Duncan P. N. Exon Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 13 14:34:15 PST 2022


dexonsmith added a comment.

In D133715#3993178 <https://reviews.llvm.org/D133715#3993178>, @dblaikie wrote:

> Is it a set or a map?

The interface is a map, just that the key type is restricted to look like a hash, and the client is expected to ensure the keys are well-distributed.

> Not sure how to capture all the other fairly esoteric requirements (no removal, needs perfect hashing, lock free/thread safe, - I guess that's most of/all the user-visible major requirements (as you say, non-iterable is just a minor artifact of the current implementation, not a major defining feature of the interface)). `PerfectHashMap`? (if we were going verbose, `ThreadSafePerfectHashMap`... )

Both names make sense to me. "This is a map from a perfect hash to the data."

Issues I see with those names:

- I worry it's not obvious from either name *when* you'd want to use this. (The answer is, I think, you want this if you are building a set shared between multiple threads, you expect lots of concurrent lookup/insertion, and you want fast insertion/lookup nevertheless.)
- It might sound like the data structure does hashing. (It doesn't. The client is expected to provide the hash.)
- It steals the turf for more general map built on top of this. More later.

Regardless of the name, maybe the programmer's guide could use an explainer.

Note that there's an example data structure called ThreadSafeHashMappedTrieSet buried in the unit tests, which has an interface that hides the hash. Could be useful for some clients.

- Maybe it would make sense to lift up? Name could be PerfectHashSet
- You might want to build a map that also handles the hashing; name could potentially be PerfectHashMap

Maybe this could be PerfectHashSetImpl?

- Has the "guts" for implementing a client-friendly PerfectHashSet or PerfectHashMap?
- ... but can be used directly if you want to manage the hashing yourself?



================
Comment at: llvm/unittests/ADT/HashMappedTrieTest.cpp:241-245
+using HasherT = SHA1;
+using HashType = decltype(HasherT::hash(std::declval<ArrayRef<uint8_t> &>()));
+template <class T>
+class ThreadSafeHashMappedTrieSet
+    : ThreadSafeHashMappedTrie<T, sizeof(HashType)> {
----------------
FYI, here, buried in the unit tests (demoted due to lack of immediate use cases), this is a set data structure built on top of the trie.

If this were lifted up again for actual use, as opposed to just a test, you'd want to template it on `HasherT` and then use HashBuilder (from llvm/include/llvm/Support/HashBuilder.h) to hash the value.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133715/new/

https://reviews.llvm.org/D133715



More information about the llvm-commits mailing list