[PATCH] D133715: [ADT] Add HashMappedTrie

Wed Dec 14 10:59:34 PST 2022

dblaikie added a comment.

In D133715#3993521 <https://reviews.llvm.org/D133715#3993521>, @dexonsmith wrote:

> In D133715#3993489 <https://reviews.llvm.org/D133715#3993489>, @dblaikie wrote:
>
>>> - a test support friend class that can inspect/test the trie layout without relying on stringification;
>>
>> This bit I'm not so sure about - we don't test the bucketing behavior of `DenseHashMap` so far as I know (or other similar implementation details of the various other hashing data structures - (since they're the ones with fairly complicated implementation details, they seem the best comparison)), for instance, we only test its interface - why would we do differently for this data structure?
>
> IMO the layout is more complex than a flat array with quadratic probing. The logic for equal and/or nearly-equal hashes is a bit subtle, and there were hard-to-reason about bugs originally. The layout-stringification was necessary to understand what was going wrong, and the tests that use it help ensure future refactorings don't get it wrong.
>
> I don't remember the bugs, but two examples of subtleties:
>
> - On an exact match you don't want to "sink" the existing entry down a level to a new sub-trie (you need to detect "exact match" before sinking). Getting this wrong will affect performance but not otherwise be user-visible.
> - The deepest sub-trie might be a different/smaller size than the others because there are only a few bits left-over, and the handling needs to be right. It's simpler to check for correct layout directly than to guess about what user-visible effects there might be for errors.
>
> I'd be a bit uneasy with the layout tests being dropped altogether.
>
> Maybe an alternative to testing the layout directly would be to add a verification member function that iterated through the data structure and ensured everything was self-consistent (else crash? else return `false`?). Then the tests could call the member function after a series of insertions that might trigger a "bad" layout.

Fair enough - if it's sufficient to have a verify operation (maybe "assertValid" - so, yeah, crash when not valid) I'd go with that, but given the argument you've made, if you think verifying the specific structure is significantly more valuable than that, I'd be OK with some private/test-friended introspection API.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133715/new/

https://reviews.llvm.org/D133715