[PATCH] D33155: [DWARFv5] Support FORM_strp in the line table header

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 19 13:20:42 PDT 2017


dblaikie added a comment.

In https://reviews.llvm.org/D33155#775734, @probinson wrote:

> > Sorry, I'm still a bit fuzzy on the details/goals/end state.
>
> Yes, well.  DWARFFormValue is a bit slippery, partly because forms are a bit slippery, and partly because it seems DWARFFormValue was a convenient place to stuff miscellaneous and not entirely related queries about forms.
>
> Users of DWARFFormValue tend to want one of three things: (a) the data value described by the form, as extracted from a DataExtractor, and possibly from another section; (b) the *size* of the data value described by the form, because they want to skip over it; or (c) whether the size of a data value that *could* be described by the form is fixed-size, and if so, what size is that.  The latter case comes up when parsing an abbrev table on its own, some consumers like to know ahead of time which abbrevs describe DIEs that are fixed-size and how big those are.
>
> This gets trickier because sometimes the data value depends on external information (e.g. relocations) and sometimes the data value has a fixed size but that size depends on external information (e.g. the DWARF version or 32/64 format).  Up through DWARF v4, all of that external information was available through the relevant DWARFUnit, because forms always described data from a Unit, and so passing a DWARFUnit pointer in to various methods was enough to hide all the details about bits and pieces from the callers.
>
> Now in DWARF v5, forms become even more exciting, because sometimes the data value depends on even more external information (e.g. yet another object-file section such as the string-offset section or the address table).  And when we are looking at forms describing information in the .debug_line section, the DWARF version and 32/64 format come from the line-table header not the Unit, and the relevant relocations are .debug_line relocations not .debug_info relocations.  When we're dumping a .debug_line section in dwarfdump, really we shouldn't be relying on a Unit at all.
>
> So, I am trying to package up the information that DWARFFormValue needs in order to answer the various queries that it gets, without DWARFFormValue itself needing to understand whether it is looking at .debug_info or .debug_line (it should be given the relevant info by its callers), and without making DWARFFormValue's API look too ridiculous.


Ah, right, thanks for explaining/refreshing all that.

seems like anywhere that still has a DWARFCompileUnit in the DWARFFormValue is a possible bug waiting to happen for non-Unit uses of forms? I think it'd be good to port it all over so it's not in this hybrid/partially supported state. Perhaps the best thing would be if FormSizeHelper (hmm, wonder if there's a better name for that) had an implicit ctor from DWARFCompileUnit so DWARFFormValue could have one API in terms of FormSizeHelper only & callers can either construct the pieces they need, or have it constructed from a DWARFCompileUnit.



================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFFormValue.cpp:333
                                   uint32_t *OffsetPtr, const DWARFUnit *CU) {
-  U = CU;
+  if (CU) {
+    U = CU;
----------------
Maybe reverse this to reduce indentation/early exit:

  if (!CU)
    return extractValueImpl(Data, OffsetPtr, nullptr);
  U = CU;
  ...




================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFFormValue.cpp:334
+  if (CU) {
+    U = CU;
+    C = &CU->getContext();
----------------
The existence of this 'U' member makes it look like the dumping and other code may not work? Since the FormSizeHelper infrastructure won't be used in those cases. That seems problematic/inconsistent.


================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFFormValue.cpp:655-658
+  if (U) {
+    if (const char *Str = U->getStringExtractor().getCStr(&Offset))
+      return Str;
+  } else if (const char *Str = C->getStringExtractor().getCStr(&Offset))
----------------
This sort of fallback seems error prone or confusing. Are there cases where the first part is necessary? If so, it seems like there would be places where the second/new part is incorrect?


https://reviews.llvm.org/D33155





More information about the llvm-commits mailing list