[Lldb-commits] [PATCH] D75929: [DebugInfo] Support DWARFv5 index sections.

Tue Mar 17 22:41:13 PDT 2020

dblaikie added a comment.

Side note: This has become complicated enough it might be worth separating these patches now rather than later - changes to dumping should be separate from changes to llvm-dwp, for instance.

In D75929#1926834 <https://reviews.llvm.org/D75929#1926834>, @labath wrote:

> In D75929#1924373 <https://reviews.llvm.org/D75929#1924373>, @ikudrin wrote:
>
> > @dblaikie, @labath, as far as I can understand, the patch complies with your vision. The main difference is that the enum is still intended for internal use only, but it probably should not go to the public part before the proposed values are accepted. Anyway, even while the proposal of the combined index is not approved, I believe that the patch is useful per se because it allows reading standard index sections and can later be easily extended for combined indexes. The patch does not restrict that movement. Please, correct me if I misunderstand anything.
>
>
> Sorry about the delay. It took a while to get this to the top of my stack.
>
> Yes, this "complies with my/our vission", but looking at the `UnknownColumnIds` field, I am starting to have second thoughts about that vision. :( Being able to represent "unknown" columns and at the same time mapping all columns to a internal representation makes things a bit awkward.
>
> The reason this wasn't a problem for location lists is that you cannot "skip over" an unknown DW_LLE entry -- it terminates the parse right away.
>
> However, that is not the case for debug_?u_index, where you can easily ignore unknown columns. In that light, I am wondering if it wouldn't be better after all to stick to the on-disk column numbers internally (but maybe still keep the "unified" v5 enum in the public interface). @dblaikie, what do you make of that?
>
> (btw, is there a test case for the "unknown column" code path?)

I'm undecided - yeah, it is awkward having an extra data structure to store the original parsed values & then remapping them back, etc. Alternatively we could use a single 64 bit value - and store unknown values shifted up into the top 32 bits (really I sort of wish they'd just used only 1 byte for these values - seems unreasonable that we'd need 256 sections... clearly we don't need them now)? & shift them back if the values too high & report it as unknown. Sort of doing something /like/ if the DWARF spec actually had an range reserved for implementation extensions, which will hopefully be added in the future.

But I wouldn't be completely opposed to the data structure keeping things in its parsed form & printing out based on the version of the index, etc. While having an API for querying based on the v5-with-extensions identifiers, though that seems a bit awkward.

================
Comment at: llvm/include/llvm/DebugInfo/DWARF/DWARFUnitIndex.h:47-48
+// For pre-standard ones, which correspond to sections being deprecated in
+// DWARFv5, the values are chosen more or less arbitrary and a tag "_EXT_"
+// is added to the names.
+//
----------------
Probably not arbitrarily - in the sense that this is an extension that consumers/producers will need to agree to - so maybe saying that ("non-standard extension"/"proposed for standardization" or something to that effect) and/or linking to the dwarf-discuss thread to support why these values were chosen & they can't be changed arbitrarily.

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnitIndex.cpp:96-100
   Version = IndexData.getU32(OffsetPtr);
+  if (Version != 2) {
+    *OffsetPtr = BeginOffset;
+    Version = IndexData.getU16(OffsetPtr);
+    if (Version != 5)
----------------
What endianness is this encoded with? If it's little endian, then a 2 byte field with 2 bytes of zero padding should be the same as reading a 4 bytes, or the other way around, right? So perhaps we could just always read it as 2 bytes then 2 bytes of padding rather than having to the version/reset/reread dance here?

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnitIndex.cpp:131

+  // Fix InfoColumnKind: in DWARFv5, type units also lay in .debug_info.dwo.
+  if (Header.Version == 5)
----------------
jhenderson wrote:
> also lay -> are
Should we be fixing it up here - or should we avoid setting it incorrectly in the first place?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75929/new/

https://reviews.llvm.org/D75929