[PATCH] D35163: [PDB] Enable NativeSession to create symbols for built-in types on demand

Tue Jul 11 13:32:37 PDT 2017

The only difference between TPI and IPI is the leaf types that can appear.
But they both start at kFirstNonSimpleIndex, and IPI records can reference
TPI records but not the other way around
On Tue, Jul 11, 2017 at 12:02 PM Adrian McCarthy via Phabricator <
reviews at reviews.llvm.org> wrote:

> amccarth marked 2 inline comments as done.
> amccarth added a comment.
>
> I've got one more suggestion from Zach to take care of, then I'll upload
> an updated patch.
>
>
>
> ================
> Comment at: llvm/include/llvm/DebugInfo/PDB/Native/NativeSession.h:42
>
> +  uint32_t findSymbolByTypeIndex(uint32_t TI);
> +
> ----------------
> zturner wrote:
> > rnk wrote:
> > > Any reason not to use codeview::TypeIndex for this parameter? Just
> trying to avoid exposing internal libraries out through this interface?
> > +1
> I tried TypeIndex originally, but there were two issues:
>
> 1.  Users of the API generally get the type indexes from symbol methods
> like getTypeId, which (like DIA), return a uint32_t.  Using a TypeIndex
> seemed to introduce conversions without much value
> 2.  There's no standard hash function for a TypeIndex, and defining one
> seemed gratuitous.
>
> I've gone ahead and switched back to TypeIndex.
>
>
> ================
> Comment at: llvm/lib/DebugInfo/PDB/Native/NativeSession.cpp:44
> +  PDB_BuiltinType Type;
> +  uint64_t Size;
> +} BuiltinTypes[] = {
> ----------------
> rnk wrote:
> > It seems unlikely that a builtin type will grow beyond 4GB.
> Sure, but, again, this was a matter of conforming to the types used by the
> already-defined API.  The getLength() method for a symbol returns a
> uint64_t, so I was trying to avoid unnecessary conversions.  I've switched
> to uint32_t.
>
>
> ================
> Comment at: llvm/lib/DebugInfo/PDB/Native/NativeSession.cpp:107-108
> +  // First see if it's already in our cache.
> +  const auto Entry = TypeIndexToSymbolId.find(Index);
> +  if (Entry != TypeIndexToSymbolId.end())
> +    return Entry->second;
> ----------------
> zturner wrote:
> > zturner wrote:
> > > I can think of a couple of issues with this function.
> > >
> > > 1. I might be misunderstanding the purpose of `findSymbolByTypeIndex`,
> but doesn't `TypeIndex` imply that this function *only* works for types?
> What about symbols, which don't have a type index?  Would those get a
> different function?
> > >
> > > 2. What about id types (e.g. types from the IPI string instead of the
> TPI stream)?  Would we eventually add a `findSymbolByIdIndex` too?
> > >
> > > 3. Why even have a map?  Why not just have a
> `std::vector<std::unique_ptr<NativeRawSymbol>> Types;` and then in the
> constructor you resize it to `Tpi.getNumTypeRecords()`, and in this
> function if it's simple you create one on the fly, otherwise you access
> `Types[TI.toArrayIndex()]`, and if it's `nullptr` you create a new one
> otherwise you return the existing one?  This would be O(1) instead of doing
> the hash computation.
> > Alternatively, maybe we don't even need a cache?
> `LazyRandomTypeCollection` already has O(1) amortized random access, as it
> is effectively already the cache.  So maybe you can create everything on
> the fly, and not even have a cache?
> 1.  No, this is specifically for looking up a type symbol by its type
> index, not the unique symbol index that the API hands out.  The motivation
> here is, for example, a record for an enum has a method called
> getUnderlyingType which returns a TypeIndex, not a unique symbol index, so
> we need to be able to find a type symbol by its type index.
>
> 2.  I'm still fuzzy on what the IPI stream contains versus the TPI.  What
> is an id type?
>
> 3.  The symbol cache, which is indexed by symbol IDs is just a vector.
>
>
> ================
> Comment at: llvm/lib/DebugInfo/PDB/Native/NativeSession.cpp:107-108
> +  // First see if it's already in our cache.
> +  const auto Entry = TypeIndexToSymbolId.find(Index);
> +  if (Entry != TypeIndexToSymbolId.end())
> +    return Entry->second;
> ----------------
> amccarth wrote:
> > zturner wrote:
> > > zturner wrote:
> > > > I can think of a couple of issues with this function.
> > > >
> > > > 1. I might be misunderstanding the purpose of
> `findSymbolByTypeIndex`, but doesn't `TypeIndex` imply that this function
> *only* works for types?  What about symbols, which don't have a type
> index?  Would those get a different function?
> > > >
> > > > 2. What about id types (e.g. types from the IPI string instead of
> the TPI stream)?  Would we eventually add a `findSymbolByIdIndex` too?
> > > >
> > > > 3. Why even have a map?  Why not just have a
> `std::vector<std::unique_ptr<NativeRawSymbol>> Types;` and then in the
> constructor you resize it to `Tpi.getNumTypeRecords()`, and in this
> function if it's simple you create one on the fly, otherwise you access
> `Types[TI.toArrayIndex()]`, and if it's `nullptr` you create a new one
> otherwise you return the existing one?  This would be O(1) instead of doing
> the hash computation.
> > > Alternatively, maybe we don't even need a cache?
> `LazyRandomTypeCollection` already has O(1) amortized random access, as it
> is effectively already the cache.  So maybe you can create everything on
> the fly, and not even have a cache?
> > 1.  No, this is specifically for looking up a type symbol by its type
> index, not the unique symbol index that the API hands out.  The motivation
> here is, for example, a record for an enum has a method called
> getUnderlyingType which returns a TypeIndex, not a unique symbol index, so
> we need to be able to find a type symbol by its type index.
> >
> > 2.  I'm still fuzzy on what the IPI stream contains versus the TPI.
> What is an id type?
> >
> > 3.  The symbol cache, which is indexed by symbol IDs is just a vector.
> Re:  LazyRandomTypeCollection.  Yes, I think that makes sense to do this
> for the user-defined types, so that seems like it would replace the need
> for caching those.  But findSymbolByTypeIndex needs to work for both
> user-defined types and built-in types.
>
> The only trick is that the cached objects are stamped with the symbol's
> unique ID, so somewhere I'm going to have to keep track of those so that
> re-visiting a type gets the original ID.
>
>
> https://reviews.llvm.org/D35163
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170711/38ca3133/attachment-0001.html>