[PATCH] D87179: Fix debug_abbrev emitter to only assign table id once
Xing GUO via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Sep 20 22:50:36 PDT 2020
Higuoxing added inline comments.
================
Comment at: llvm/tools/obj2yaml/dwarf2yaml.cpp:27-28
for (auto AbbrvDeclSet : *AbbrevSetPtr) {
+ if (AbbrvDeclSet.second.begin() == AbbrvDeclSet.second.end())
+ continue;
Y.DebugAbbrev.emplace_back();
----------------
aadsm wrote:
> jhenderson wrote:
> > aadsm wrote:
> > > jhenderson wrote:
> > > > aadsm wrote:
> > > > > jhenderson wrote:
> > > > > > Higuoxing wrote:
> > > > > > > Do we really need this check here? I think it will prevent us from dumping an empty abbrev table.
> > > > > > >
> > > > > > > ```
> > > > > > > debug_abbrev:
> > > > > > > - ID: 0
> > > > > > > Table:
> > > > > > > - Code: 1
> > > > > > > Tag: DW_TAG_compile_unit
> > > > > > > Children: DW_CHILDREN_yes
> > > > > > > Attributes:
> > > > > > > - Attribute: DW_AT_producer
> > > > > > > Form: DW_FORM_strp
> > > > > > > - ID: 1 ## This table will not be dumped.
> > > > > > > ```
> > > > > > Could we get a test case for this situation, please? Either separately or as part of this patch, I don't mind.
> > > > > That is true. I didn't realize there should always be an empty table at the end of the debug_abbrev. I'll update then.
> > > > I don't think there always needs to be an empty table at the end of debug_abbrev (IIRC, there is a null byte added at the end of the abbrev table, but that's not a table itself), but I think we want to allow a user to have one, if they want.
> > > oh, but that's exactly what will happen if the `begin() == end()` check is not there since this code optimistically creates a table before checking if it reached the last item in the iteration or not. That's the reason I added it, maybe I misunderstood @Higuoxing's comment.
> > Section 7.5.3 of the DWARF v4 standard discusses the format of the .debug_abbrev section (I haven't checked, but I believe it's the same for other standards). In particular there is a statement that says "The abbreviations for a given compilation unit end with an entry consisting of a 0 byte for the abbreviation code." An empty abbrev table would consist of just that 0 byte, and nothing else. So a .debug_abbrev section with the content `{0, 0, 0, 0}` would actually represent 4 separate empty tables.
> >
> > We want to allow a user the ability to add an empty table. The format would be:
> > ```
> > debug_abbrev:
> > - ID: 42 ## This should be an empty table, with explicit ID.
> > - Table: [] ## Also an empty table, with implicit ID.
> > - ID: 100
> > Table: [] ## Also an empty table, with explicit ID.
> > ```
> > I believe all three should produce identical output (i.e. a single 0 byte), resulting in a 3-byte debug_abbrev section.
> >
> > As far as I can tell, your patch looks good here.
> Ok, before I update this patch, I want to have agreement from both of you that the `begin() == end()` check I had before is required to avoid always having an empty table added.
>
> Basically, that if I have the following yaml:
> ```
> debug_abbrev:
> - ID: 3
> - Table:
> - Code: 1
> Tag: DW_TAG_compile_unit
> Children: DW_CHILDREN_no
> Attributes:
> - Attribute: DW_AT_low_pc
> Form: DW_FORM_addr
> - ID: 4
> - Table:
> - Code: 1
> Tag: DW_TAG_compile_unit
> Children: DW_CHILDREN_no
> Attributes:
> - Attribute: DW_AT_low_pc
> Form: DW_FORM_addr
> ```
>
> And passing it through yaml2obj | obj2yaml, should yield:
>
> ```
> debug_abbrev:
> - ID: 0
> - Table:
> - Code: 1
> Tag: DW_TAG_compile_unit
> Children: DW_CHILDREN_no
> Attributes:
> - Attribute: DW_AT_low_pc
> Form: DW_FORM_addr
> - ID: 1
> - Table:
> - Code: 1
> Tag: DW_TAG_compile_unit
> Children: DW_CHILDREN_no
> Attributes:
> - Attribute: DW_AT_low_pc
> Form: DW_FORM_addr
> ```
>
> and **not** (the next output is what this diff in its current iteration does):
>
> ```
> debug_abbrev:
> - ID: 0
> - Table:
> - Code: 1
> Tag: DW_TAG_compile_unit
> Children: DW_CHILDREN_no
> Attributes:
> - Attribute: DW_AT_low_pc
> Form: DW_FORM_addr
> - ID: 1
> - Table:
> - Code: 1
> Tag: DW_TAG_compile_unit
> Children: DW_CHILDREN_no
> Attributes:
> - Attribute: DW_AT_low_pc
> Form: DW_FORM_add
> - ID: 2
> ```
>
Currently, `obj2yaml` uses the index of the abbrev table as its ID. That is to say, the `ID` field of the abbrev table should always be the index of the table. The `AbbrevTableID` field of the compilation unit should always be the distance between the associated abbrev table and the first abbrev table.
Your patch looks good and I think we **shouldn't** add the checker `begin() == end()` to avoid dumping empty abbrev tables since the DWARF spec doesn't forbid having empty abbrev tables. Besides, the output of `obj2yaml` and the input of `yaml2obj` should be identical though the abbrev tables might have different IDs.
```
debug_abbrev:
- ID: 3 ## Empty table with explicit ID(3).
- Table: ## Non-empty table with implicit ID(1).
- Code: 1
Tag: DW_TAG_compile_unit
Children: DW_CHILDREN_no
Attributes:
- Attribute: DW_AT_low_pc
Form: DW_FORM_addr
- ID: 4 ## Empty table with explicit ID(4).
- Table: ## Non-empty table with implicit ID(3).
- Code: 1
Tag: DW_TAG_compile_unit
Children: DW_CHILDREN_no
Attributes:
- Attribute: DW_AT_low_pc
Form: DW_FORM_addr
```
after passing it through `yaml2obj | obj2yaml`, the output should be:
```
debug_abbrev:
- ID: 0 ## Empty table.
- ID: 1 ## Non-empty table with ID/Index (1).
Table:
- Code: 0x0000000000000001
Tag: DW_TAG_compile_unit
Children: DW_CHILDREN_no
Attributes:
- Attribute: DW_AT_low_pc
Form: DW_FORM_addr
- ID: 2 ## Empty table with ID/Index(2).
- ID: 3 ## Non-empty table with ID/Index(3).
Table:
- Code: 0x0000000000000001
Tag: DW_TAG_compile_unit
Children: DW_CHILDREN_no
Attributes:
- Attribute: DW_AT_low_pc
Form: DW_FORM_addr
```
This is exactly the output of `obj2yaml` after applying this patch.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D87179/new/
https://reviews.llvm.org/D87179
More information about the llvm-commits
mailing list