[PATCH] D132455: [ADT] add ConcurrentHashtable class.

Alexey Lapshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 16 05:32:44 PDT 2023


avl added a comment.

In D132455#4197342 <https://reviews.llvm.org/D132455#4197342>, @JDevlieghere wrote:

> In D132455#4106255 <https://reviews.llvm.org/D132455#4106255>, @avl wrote:
>
>> ping.
>>
>> @JDevlieghere @aprantl Do you think it is better to move this ConcurrentHashtable into the DWARFLinkerParallel folder?
>
> I'm fine either way. If someone has concerns about the implementation, it's probably less contentious to land it in the DWARFLinkerParallel first, but so far I've not heard any objections. If we do want to use this from LLDB, then we'll need it to be in ADT eventually, though I'd like to do a comparison with Steven's `HashMappedTrie` for LLDB's real world usage.

I have a utility to do the comparison - https://reviews.llvm.org/D132548. The problem is that `HashMappedTrie` does not resolve hash collisions; it is assumed that such collisions would be resolved by the client of `HashMappedTrie`. I am not sure what the right way to resolve such collisions is, so I do not know how to make a "fair" comparison at the moment.



================
Comment at: llvm/include/llvm/ADT/ConcurrentHashtable.h:297
+    assert((CurBucket.Size > 0) && "Uninitialised bucket");
+    if (CurBucket.NumberOfEntries < CurBucket.Size * 0.9)
+      return;
----------------
JDevlieghere wrote:
> What's the benefit of rehashing at 90% capacity? It seems like this is always going to leave a few empty slots in the table. I understand you always need to have one free slot because you rehash after insertion, but it seems like you could wait and rehash only when you've exhausted the bucket?
When the hashtable is nearly 100% full, it has to scan too many entries while searching for a free slot. The worst case: if a bucket has 1000000 slots and already holds 999999 entries, an insertion might need to enumerate all 999999 occupied slots, which is slow.
If the same bucket is only 90% full, the number of slots that must be scanned before finding a free one is much smaller.

So wasting 10% of the memory buys roughly a 20% performance improvement. The exact value 0.9 was determined experimentally. It would probably be good to make this threshold configurable (for cases where memory is more important than speed).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132455/new/

https://reviews.llvm.org/D132455


