[PATCH] D83116: [DWARFYAML] Add support for referencing different abbrev tables.

Thu Aug 20 02:45:34 PDT 2020

Higuoxing added inline comments.

================
Comment at: llvm/include/llvm/ObjectYAML/DWARFEmitter.h:34
+private:
+  Data &DWARF;
+  std::map<uint64_t, uint64_t> AbbrevID2Index;
----------------
jhenderson wrote:
> Higuoxing wrote:
> > labath wrote:
> > > jhenderson wrote:
> > > > Would it make any sense to merge the `DWARFYAML::Data` class and `DWARFYAML::DWARFState` class at all?
> > > That would definitely be nice.
> > >>! @jhenderson :
> > > Would it make any sense to merge the `DWARFYAML::Data` class and `DWARFYAML::DWARFState` class at all?
> > 
> > >>! @labath :
> > > That would definitely be nice.
> > 
> > I think they can be merged. But is there any particular reason to merge them? Personally, I would like to keep the `DWARFYAML::DWARFState` class. Here are a few concerns I would like to address:
> > 
> > - If we merge the `DWARFYAML::DWARFState` and `DWARFYAML::Data` at all, we will need to add some helper variables, e.g. `AbbrevTableID2Index` and a method like `Error DWARFYAML::link()` to `DWARFYAML::Data` class to link DWARF sections. We will have to call this special method before DWARF sections being emitted. If we port DWARF support to other binary format in the future, e.g., wasm, we might forget to call this method. If we make DWARF emitters accept `DWARFYAML::DWARFState`, it's easy to figure out that we need a `DWARFState` before emitting DWARF sections.
> > 
> > - Besides, I think `DWARFYAML::Data` and `DWARFYAML::DWARFState` have different functions. `DWARFYAML::Data` is used to contain original DWARF section descriptions and `DWARFYAML::DWARFState` contains helper structs and methods which helps link DWARF sections. It might be good not to merge them so that we can easily figure out when and where are those sections get linked?
> > 
> > I can be persuaded :-)
> > I think they can be merged. But is there any particular reason to merge them?
> As things stand, it mostly seems like `DWARFState` merely gets in the way of accessing the `Data` class. It seems to me like it would be easier to use if the two were one class, as you wouldn't have to keep going through an extra function call/member access.
> 
> 
> > - If we merge the `DWARFYAML::DWARFState` and `DWARFYAML::Data` at all, we will need to add some helper variables, e.g. `AbbrevTableID2Index` and a method like `Error DWARFYAML::link()` to `DWARFYAML::Data` class to link DWARF sections. We will have to call this special method before DWARF sections being emitted. If we port DWARF support to other binary format in the future, e.g., wasm, we might forget to call this method. If we make DWARF emitters accept `DWARFYAML::DWARFState`, it's easy to figure out that we need a `DWARFState` before emitting DWARF sections.
> If, rather than do all the calculations up front (e.g. mapping tables to IDs), you put the `Data` members behind getters, you could then add the "linking" code there. For example, with the AbbrevTable ID mapping, you could have the `getAbbrevTableIndexByID` method, which is the only way of getting an abbrev table from outside the class.  The first time this is called (or optionally every time), it works out the mapping between ID(s) and table(s). This avoids the need for an explicit `link` or `finalize` method.
> 
> > - Besides, I think `DWARFYAML::Data` and `DWARFYAML::DWARFState` have different functions. `DWARFYAML::Data` is used to contain original DWARF section descriptions and `DWARFYAML::DWARFState` contains helper structs and methods which helps link DWARF sections. It might be good not to merge them so that we can easily figure out when and where are those sections get linked?
> I'm not sure I agree they have different functions. The `AbbrevID2Index` map is just a helper variable that is directly derived from the `Data` struct's members. We could get rid of it entirely, and just move the `getAbbrevTableIndexByID` method into `Data`, rewriting it to just run through the list of tables each time to find the right ID. This is obviously less efficient than instantiating a mapping up front, if done repeatedly, but I think it indicates that this extra layer is not actually doing anything distinct. Similar principles apply probably for other things, although without further concrete examples, it's hard to discuss them!
Thanks for the explanation!

================
Comment at: llvm/lib/ObjectYAML/DWARFEmitter.cpp:89
+DWARFYAML::DWARFState::DWARFState(DWARFYAML::Data &DI, Error &Err) : DWARF(DI) {
+  ErrorAsOutParameter EAO(&Err);
+
----------------
labath wrote:
> If you'd do all this work in the factory function and then just pass in a finished map to the constructor, there'd be no need for the `ErrorAsOutParameter` thingy.
`DWARFYAML::DWARFState` is merged into `DWARFYAML::Data`, so we don't need `ErrorAsOutParameter` any more.

================
Comment at: llvm/lib/ObjectYAML/DWARFEmitter.cpp:92
+  for (uint64_t I = 0; I < DI.DebugAbbrev.size(); ++I) {
+    // If the abbrev table's ID isn't specified, we use the index as its ID.
+    uint64_t ID = I;
----------------
jhenderson wrote:
> Maybe to avoid weird errors, it might make sense to disallow mixing the two methods, i.e. if one table has an explicit ID, the others all must do too. What do you think? It may not be worth it if the code is more complex, I don't know.
I think it's a little bit difficult to distinguish these 2 situations. Besides, sometimes we might want to add one compilation unit to a test case and let it reference an existing abbrev table. We don't need to mutate the whole test case to add IDs. What do think of it?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83116/new/

https://reviews.llvm.org/D83116