[PATCH] D33009: [CodeView] Add an amortized O(1) random access type visitor

Zachary Turner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 10 16:00:26 PDT 2017


zturner added a comment.

In https://reviews.llvm.org/D33009#751659, @rnk wrote:

> I think you probably want to go to the simpler strategy we described at lunch, which looks more like:
>
> - Maintain the KnownOffsets map from TypeIndex to offset, where all offsets start off unknown. You could use a sentinel value (~0U) or a separate BitVector to track if the offset is known.
> - If an offset is unknown, binary search the IndexToOffset array to find the start and end offsets of that range of records.
> - Scan that range of records and update KnownOffsets for all records in that range.
>
>   This avoids bad worst case behavior that might come up if the user scans forwards or backwards through type indices. The difference between this algorithm and what you have is that you're trying to avoid scanning the complete range, but that means you have to do scans through the BitVector to try to resume scanning where you left off. That isn't strictly necessary to get the fast random access that we're looking for.


I'm still not entirely convinced.  At lunch we discussed doing this as a way to avoid having a `BitVector` at all.  But it turns out we need the `BitVector` regardless, because the `TypeDatabase` needs to know what records are valid, and it is decoupled from the visitor.  So that aspect doesn't really buy us anything.

I think it comes down to access patterns.  90% or more of the types in a PDB are never going to be accessed.  You're looking for a symbol, it says the type of the symbol is 329082, what are the odds you're going to later access symbol 329075?  Pretty low by my estimation.  Type accesses will be sparse, and so while it will increase our cache hit ratio to scan the whole block up front, it doesn't necessarily seem like a win to me.  Most type accesses are going to come from un-examined chunks.


https://reviews.llvm.org/D33009





More information about the llvm-commits mailing list