[Lldb-commits] [lldb] Improve performance of .debug_names lookups when DW_IDX_parent attributes are used (PR #91808)

Thu May 23 10:28:23 PDT 2024

dwblaikie wrote:

Sorry I can't quite figure out who/where to reply to, so perhaps by summarizing the situation (though it's going to be verbose, and I'm going to add some other examples/complications) I can get a grip on all this/clarify where we're at.

So, history. 

* minimum spec-compliant .debug_names (DWARFv5, 6.1.1 Lookup by Name) implementation contains a name entry for each unqualified name which contains index entries for each definition DIE with that unqualified name (I'm glossing over all the pedantic language in the spec - can read that if anyone's interested)

* the original [Apple names](https://llvm.org/docs/SourceLevelDebugging.html#name-accelerator-tables) also included a hash of the qualified name to check that the exact name is found, but that relies on textually identical names, which isn't portable/wouldn't work for humans writing names in a variety of ways - so the standardized version allowed producers to include a DW_IDX_parent link up to the (hmm, I can't find any mention of the "hash of the fully qualified name" part in the llvm.org docs, so perhaps they're out of date -  yeah, it's in the code, DW_ATOM_qual_name_hash but now I just have further question (it seems this data's only generated for dsyms, not tables in .o files))

* having only unqualified names in the index with no further context requires a fair bit of DWARF deserialization to do fully or partially qualified name lookup - got to walk the whole DIE tree of the CU to find the unqualified DIEs parents to check them against the qualified name

* so @felipepiovezan and others worked on adding DW_IDX_parent support to LLVM's .debug_names emission code, so that qualified names could be compared using the parent chain - deserializing just the specifically targeted DIEs/basically allowing quick access to the parent chain of the unqualified DIE

* One complication of DW_IDX_parent implementation is how to detect the difference between an index name entry that describes an entity in the global namespace (ie: truly has no qualifying name needed to reach/identify it) and another entry that a producer didn't choose to emit DW_IDX_parent for.

* The way this was addressed was to use `DW_IDX_parent` with `DW_FORM_flag_present` to indicate that this DIE's qualified and unqualified name are identical, it has no qualifying parent

* The DWARF spec does allow unnamed entries in the index - not in the name index (because that's the list of non-empty names), but in the index entry list

* The DWARF spec does say that if a named entity's parent isn't indexed, perhaps because it's unnamed (unnamed entities don't go in the /name/ index) then a producer might reference an unnamed index entry, or might skip the entity and reference its named parent.

* The /intent/ of this wording I think is clear, to allow consumers to still do qualified name lookup directly from an index-with-parents even if there are some unnamed scopes present - but the language is too loose because there can be named entities that aren't present in the index (because they aren't definitions) that, if skipped, would make the index basically unusable - the consumer wouldn't know they were skipped and would treat those name components as not-present, so "foo::bar::baz" would appear to a consumer reading the index as "foo::baz" and a query for "foo::bar::baz" would be rejected.

I think the cases where these missing parents come up are most likely not present for Apple, the two places I can think of are the one we're currently discussing, type units - and `-fno-standalone-debug` situations.

The latter might be more informative as an example to consider?

```
struct t1 {
  virtual void f1();
  struct t2 { };
};
t1::t2 v1;
```
```
$ llvm-dwarfdump-tot missing_parent.o | grep "DW_TAG\|DW_AT_name\|DW_AT_declaration" | sed -e "s/^.{12}//"
0x0000000c: DW_TAG_compile_unit
              DW_AT_name        ("missing_parent.cpp")
0x0000001e:   DW_TAG_variable
                DW_AT_name      ("v1")
0x00000029:   DW_TAG_structure_type
                DW_AT_name      ("t1")
                DW_AT_declaration       (true)
0x0000002b:     DW_TAG_structure_type
                  DW_AT_name    ("t2")
```
```
$ llvm-dwarfdump-tot missing_parent.o -debug-names
"v1"
        Tag: DW_TAG_variable
        DW_IDX_die_offset: 0x0000001e
        DW_IDX_parent: <parent not indexed>
"t2"
        Tag: DW_TAG_structure_type
        DW_IDX_die_offset: 0x0000002b

```
In this case I think we currently can't do qualified lookup of `t1::t2` using just the index. We'd have to go and at least quick-parse the whole CU to find `t2`'s parent(s) and check them.

In the case where `t1` is defined in the same translation unit/CU (and we put `DW_IDX_parent` on the `t2` entry) we still /probably/ have to parse the `t1` DIE, because the entry offset we'd emit wouldn't tell us the name of `t1`. Unless we reverse the name entry->index entry mapping to find the name? (@felipepiovezan can you confirm how this works today?)

But the spec does allow us to put index entries that aren't named in - so we could put an index entry (not a named entry - no string offset, no hash) in for `t1` here. If my guess above is correct, that we don't reverse the name entry->index entry mapping, and instead just parse the parent DIEs directly anyway - then this unnamed entry is no worse for index performance (it does hurt index size, of course - but without having to carry the hash/etc, perhaps not too much?).

For type units, it's more complicated, maybe? But I think, again, it would be correct to put DW_IDX_parent pointing to the skeleton type as an index entry (not a named entry) if that's the right index-size/index-performance tradeoff. The client can navigate from the skeleton type to the full type to retrieve/check the name.

As for putting the real type in the type unit as the parent - I think that gets a bit complicated. A consumer could easily detect this situation occurred, so they shouldn't necessarily get confused by finding a situation where a child's parent doesn't contain the child (though I'm sure some consumers would get confused a little bit) - "oh, the child's parent is in a different unit from the child, of course the child isn't an actual child of the parent, must be some type unit/definition/declaration shenanigans going on" - but, yeah, with split DWARF/dwp situations, that's probably got more complications than benefits.

So, in conclusion:

* What bolt is doing, where it's skipping over type unit skeletons (does it also skip over declarations, as in the t1::t2 example above? I guess it probably does - that's equally problematic) or any other named-but-not-indexed entities when choosing which parent to point to is incorrect (whether the spec says it or not, it produces fairly misleading debug info - better off not putting a parent at all). Bolt should stop doing that - at minimum, omit the parent (as clang does), at maximum, emit an unnamed index entry referring to the declaration. (may require improvements to lldb to support the latter, though)

* If someone wants to improve clang to emit unnamed index entries for un-indexed parents, and measure the space/lldb-name-lookup-perf benefits, I think that'd be good/ok?

https://github.com/llvm/llvm-project/pull/91808