[PATCH] D125979: [ADT] add ConcurrentHashTable class.

Wed Jun 1 07:49:17 PDT 2022

avl added inline comments.

================
Comment at: llvm/include/llvm/ADT/ConcurrentHashTable.h:117-118
+    else if (CurSlabNum > MaxNumberOfSlabs)
+      reportWarning("Too many slabs. Consider increasing initial size or "
+                    "changing hash function.");
+
----------------
MaskRay wrote:
> avl wrote:
> > dblaikie wrote:
> > > avl wrote:
> > > > dblaikie wrote:
> > > > > It's pretty weird to have warnings coming out of data structures. @MaskRay would it be reasonable to remove these/replace them with some other tool if we're promoting this to a more general purpose data structure rather than something more special-case in lld?
> > > > Probably, instead of warning we can introduce method:
> > > > 
> > > > ```
> > > > bool IsInGoodState () const {
> > > >   return CurSlabNum <= MaxNumberOfSlabs && !hasFullSlab;
> > > > }
> > > > 
> > > > 
> > > > ```
> > > > 
> > > > the consumer could call this method somewhere and inform the user:
> > > > 
> > > > ```
> > > > if (!HashTable.IsInGoodState()) {
> > > >     outs() << "Hash table is in suboptimal state";
> > > > }
> > > > ```
> > > > 
> > > > ?
> > > Maybe? I'd be curious to hear from @MaskRay here, since I guess he wrote the original/had the original motivation for the warning.
> > that is me, who introduced the warning. Original lld implementation does not have it.
> > 
> > My motivation is that it should be known somehow that hashtable is not in good state and it is necessary to revise initial size.
> The root issue is that the patch does not implement proper grow/resize, so setting a capacity and having oversize warnings here look like hacks. For many general-purpose usages it's difficult to estimate the size beforehand, or without heavily refactoring the usage pattern. So I wish that we implement a proper resizable hashtable with amortized O(1) time complexity, not the current version which will be slow if the actual size is much more larger than the reserved capacity.
This is correct. This implementation is not a good choice for the case when it is difficult to estimate the size beforehand.

But this might be a good choice for the cases when such estimation is possible. F.e. lld/COFF already uses
this implementation even without the resizing feature. I used it for DWARF files. All clang/llvm binaries are processed
without warning(and far from the resizing limits). That means that estimations are quite accurate. Still, resizing feature might be helpful for exceptional cases.

The implementation from this patch has several advantages: it is fast and lightweight, it does not require extra memory for buckets, does not require locking table or buckets for most cases, and does not require any additional memory management for the most cases.

For the task when it is not possible to estimate the size beforehand - there should be used another implementation.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125979/new/

https://reviews.llvm.org/D125979