[PATCH] D55585: RFC: [LLD][COFF] Parallel GHASH generation at link-time -- NOT FOR SUBMIT

Sat May 9 08:29:11 PDT 2020

aganea marked an inline comment as done.
aganea added inline comments.

================
Comment at: llvm/trunk/include/llvm/DebugInfo/CodeView/GlobalTypeDenseMap.h:77
+          typename BucketT = GloballyHashedInfo>
+class GlobalTypeDenseMap : public DebugEpochBase {
+  template <typename T>
----------------
santagada wrote:
> aganea wrote:
> > rnk wrote:
> > > I haven't read this implementation yet, it's quite long, but broadly I'm in favor of having a custom hash table here. This is the most performance critical thing LLD does.
> > > 
> > > This seems like a good separable change, since this map isn't actually used in parallel.
> > It's mostly a copy of `DenseMap`, but without tombstones, and with a different `DenseHashInfo` API.
> > I was wondering if a insert-only `DenseMap` would be useful in other parts of LLVM/Clang?
> I know this patch is old, but I was reading it again and I don't understand how you reconstruct the key if there were a collision on the bucket while inserting... as the bucketno will not be the same bits that were stripped when the key got packed.
> 
> In other words: LookupBucketFor starts looking for an empty bucket at the extracted bits from the key, but might have probed ahead of that number by the time we insert it in the bucket. At that time it seems that any other searches for the key on the map will fail as the key computed from that bucket doesn't have the correct value were the bucketmask is.
> 
> Also bucketmask needs to be all the lower bits of key, else there is information loss there.
> 
> Or maybe I'm just very confused on how this packing of having half the key as bucketno and half in the KV value is happening.
You're right, collision is the biggest challange here. I ran with this for a while, with both the old map and this new map side-by-side, asserting if there was a divergence, and there was not. In practice, what I am proposing here increases the chances of a key collision, and thus of a hash collision in the table. However, even with very large inputs, in the range of 1 billions type records from .OBJs, I couldn't see a single collision. However, that doesn't mean it couldn't happen :-)

I wanted to do this to make reading & writing to a bucket an atomic operation. I'm not very comftable yet with this change, I think in the long run it'd be better to rely on 128-bit data per bucket (64-bit for the key and 64-bit for the index), and do two atomic operations. Which makes things a bit more challenging for writing a truly lock-free and wait-free hash table that can resize.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55585/new/

https://reviews.llvm.org/D55585