[Lldb-commits] [PATCH] D74637: Separate DIERef vs. user_id_t: m_function_scope_qualified_name_map

Mon Feb 17 01:43:37 PST 2020

labath added a subscriber: JDevlieghere.
labath added a comment.

+ @JDevlieghere for a reason why it would be nice to have cross platform "debug map" test(s).

> I have also noticed DIERef and user_id_t in fact contain the same information which can be seen in SymbolFileDWARF::GetUID.

I am afraid the situation is not that simple. Your statement is true for all "elf" forms of dwarf (regular, dwo, dwp, type units, etc.), and also for mac dSYM files, but it is not true in the mac "debug map" scenario. In that case, a user_id_t also encodes the symbol file / module ID. You can best see this in the function doing the opposite transformation (`SymbolFileDWARF::DecodeUID`), which returns a SymbolFileDWARF in addition to a DIERef.
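
To make the distinction concrete, here is a toy sketch of an ID that carries a symbol file index next to a file-local DIE reference. Everything in it (`ToyDIERef`, `ToyDecodedUID`, `EncodeToyUID`, `DecodeToyUID`, and the 32/32 bit split) is made up for illustration; it is not the real LLDB layout or API, only the general shape of the idea.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; these are NOT the real LLDB types or bit layout.
struct ToyDIERef {
  uint32_t die_offset; // identifies a DIE within one symbol file only
};

struct ToyDecodedUID {
  uint32_t symbol_file_index; // which per-.o symbol file the DIE lives in
  ToyDIERef ref;              // the file-local part
};

using toy_user_id_t = uint64_t;

// Assumed layout: symbol file index in the upper 32 bits, DIE offset in the
// lower 32 bits.
inline toy_user_id_t EncodeToyUID(uint32_t symbol_file_index, ToyDIERef ref) {
  return (static_cast<toy_user_id_t>(symbol_file_index) << 32) |
         ref.die_offset;
}

// The opposite transformation: it has to hand back the symbol file index in
// addition to the local reference.
inline ToyDecodedUID DecodeToyUID(toy_user_id_t uid) {
  return {static_cast<uint32_t>(uid >> 32),
          ToyDIERef{static_cast<uint32_t>(uid & 0xffffffffu)}};
}

// Two .o files may both have a DIE at the same offset. The local references
// are identical, but the encoded IDs differ.
inline void ToyUIDDemo() {
  ToyDIERef same_offset{0x40};
  toy_user_id_t a = EncodeToyUID(1, same_offset);
  toy_user_id_t b = EncodeToyUID(2, same_offset);
  assert(a != b);
  assert(DecodeToyUID(a).ref.die_offset == DecodeToyUID(b).ref.die_offset);
}
```

In this toy model the local reference alone cannot tell the two DIEs apart, which is why the decode step needs to produce more than a DIERef.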

The macos debug map works by creating a separate module for each .o file, and then mangling them so that they appear to come from a single module. From five miles up, this is pretty similar to what "split dwarf" does, but it has one important distinction. In split dwarf, the dwo files can only be interpreted together with the main executable file, which contains the linked portions of the debug info (addresses, mainly). In the debug map case, the .o files contain fully standalone dwarf, and the "relinking" that we do is outside of the scope of the dwarf spec and works by the linker leaving breadcrumbs about what it has done in a custom format.

For this reason, the debug map and the split dwarf features were implemented on different levels inside SymbolFileDWARF, and one of them (split dwarf) is included in DIERef, while the other (the debug map) isn't. The main confusing part about this is that split dwarf creates multiple SymbolFile objects (SymbolFileDWARFDwo, SymbolFileDWARFDwp, SymbolFileDWARFDwoDwp :/), even though all of these things are really a part of a single SymbolFile object that happens to be spread across multiple files. This is something I am working on fixing (first by removing the Dwp flavours), and it is why I want to use your MainCU concept for split dwarf too (hopefully that would rid us of SymbolFileDWARFDwo).

The reason I am saying all of this is to illustrate why I think you can't make user_id_t and DIERef the same thing -- the former needs to be globally unique, whereas the latter is local to a single "symbol file dwarf" (with big quotes).

However, I am not sure what all of this says about this patch. In principle, I don't see a big problem with changing the type of this field. In fact, this field used to hold a DIERef until I changed that in D63322 <https://reviews.llvm.org/D63322>. However, there wasn't a strong reason for that -- I did it because it was a) convenient; b) more memory-efficient. We can change it back if it helps you in achieving your goal (and btw, thank you for doing this in small steps). It's just that currently it's not at all clear to me what that goal is. Maybe you could give a rough outline of where you are going with this. For example, what will the user_id_t decoding process look like in the end?

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D74637