[Lldb-commits] [lldb] [llvm] Add support for using foreign type units in .debug_names. (PR #87740)
Greg Clayton via lldb-commits
lldb-commits at lists.llvm.org
Fri Jun 14 17:14:46 PDT 2024
clayborg wrote:
> Most of the patch is very clean, but I'm bothered by the `getForeignTUSkeletonCUOffset` function, how it opens up with a mostly redundant (the callers checks this already) call to `getForeignTUTypeSignature`, and then proceeds with a reimplementation of `getCUOffset` (sans the type unit check). I think we could design a better set of APIs for this, but I'm not sure what those is, because I'm unclear of the meaning of various combinations of DW_IDX unit entries.
If we have an entry with a `DW_IDX_type_unit`, it can be a normal type unit, or a foreign type unit, that depends on the indexes value. If it is less than the number of local type units, then it is a local type unit and there should be no `DW_IDX_comp_unit` in the entry, else it is a foreign type unit from a .dwo or .dwp file. If we have a .dwp file, we need the `DW_IDX_comp_unit` to be able to tell if the foreign type unit made it into the .dwp file, or if this is an entry that represents the a different type unit and we should ignore it. If we have no .dwp file, then all entries for foreign type units are valid and we don't need to check the compile unit it came from.
So if you just ask an entry for its `DW_IDX_comp_unit`, you will get an answer for an entry from a normal compile unit, or for a foreign type unit with a compile unit that will be used to discriminate for the .dwp file case.
So this function is named `getForeignTUSkeletonCUOffset()` to help say "this might have compile unit entry, but we only want it if this is a foreign type unit. I took me a while to understand the spec when it came to foreign type units...
> What the new function (I think) essentially does is "give me the CU associated with this entry even if the entry does not describe a DIE in this CU" (because `DW_IDX_die_offset` is relative to a type unit)". I think this API would make sense, even without needing to talk about type units. However, is it actually implementable? For single-CU indexes, that's fine, because we can kind of assume all entries are related to that CU. But what about multi-CU indexes? The only way to determine the CU there is to have an explicit attribute. Are we saying that each entry in a multi-CU index must have a DW_IDX_compile_unit (in addition to a DW_IDX_type_unit)?
Yes, but only if the `DW_IDX_type_unit` represents a foreign type unit, and we will only need the `DW_IDX_comp_unit` if we have a .dwp file. But since we don't know if the user will use a .dwp file, we always need to put the data for the originating compile uint in as a `DW_IDX_comp_unit`.
I would rather have had a `DW_IDX_skeleton_unit` to differentiate between a `DW_IDX_comp_unit` and requiring the user to know that they can't just find the compile unit, but they must find the non-skeleton compile unit, before adding the `DW_IDX_die_offset`. But we don't have that in the spec...
>
> If the answer is yes, then this function can be implemented, but then I think the current implementation of `getCUOffset` (which I interpret as "give me the CU offset **IF** this entry is relative to that CU") doesn't make sense -- because it treats entries with an explicit `DW_IDX_compile_unit` different from an implicit/missing `DW_IDX_compile_unit`.
It doesn't treat them differently. if you ask an entry for its `DW_IDX_compile_unit` and it doesn't have one, we can fall back to grabbing a CU from the CU table, but only if the CU table has only 1 entry in it.
> And in this world, we assume that `DW_IDX_type_unit` takes precedence over `DW_IDX_compile_unit` -- if both are present then `DW_IDX_die_offset` is relative to the former. And I'm not sure if we would actually want to take up space by putting both values into the index. Looking at existing producers would be interesting, but I'm not sure if there are any (lld does not merge type unit indexes right now, possibly for this very reason). Maybe one could produce something with LTO?
LTO and bolt and soon lld will produce these single tables with multiple compile units.
And we MUST have the `DW_IDX_comp_unit` for foreign type units because type units can generate differing content and one copy of the type unit will make it into the .dwp and we need to know from which .dwo file it came and only use the entries from the matching .dwo file because the offsets can be meaningless for type units with the same signature but that came from different .dwo files.
>
> OTOH, if the answer is no, then the function isn't implementable in the generic case. That doesn't mean you can't implement this feature -- which is useful for guarding against ODR violations (exacerbated by llvm's nonconforming type signature computation algorithm...). However, I think it could be implemented in a much simpler way. For example, lldb could just check if it's looking at a single-CU index, and get the CU offset that way. No extra llvm APIs would be needed.
All of these APIs get the single CU if there is only one for both the case where you only have a `DW_IDX_comp_unit` in an entry and for the case where you have a `DW_IDX_type_unit` and/or not a `DW_IDX_comp_unit` in the same entry.
So not sure what you are asking for here, but this is the best way I can represent the messy state of the .debug_names tables. Monolithic tables with multiple CUs are here now from BOLT, and will soon be from lld, so we need to be able to handle these.
https://github.com/llvm/llvm-project/pull/87740
More information about the lldb-commits
mailing list