[PATCH] D95591: [yaml2obj/obj2yaml] - Implement program header table as a special Chunk.

Mon Feb 1 02:08:27 PST 2021

jhenderson added a comment.

@grimar is on vacation for an extended period, so I'll take over the design discussion from him, since I worked with him on coming up with the design. Below, I summarise the key design ideas, to make sure we're all on the same page.

At some point, I think we want to change `Sections:` to `Layout:`, to avoid confusion. Due to the invasiveness of this change, I think we wanted to get the member design done first, and possibly even keep `Sections:` as an alias.

With this patch, this block controls the placement of four different kinds of things (called "chunks" in the code): sections, fills, the section header table, and the program header table. These are distinguished by their `Type` field: `Fill`, `SectionHeaderTable` and `ProgramHeaderTable` have special meanings, and all other values represent some kind of section. The `Type` field then drives the meaning of the rest of the chunk. All four kinds share the same `Offset` behaviour (the offset is calculated automatically unless explicitly specified). This means we can control the placement of all components of an ELF object except the file header itself, which needs to be at a fixed location anyway.

As we all know, fills are for creating gaps in the output, and sections are for defining the section header properties and contents of a section. The `ProgramHeaderTable` chunk simply describes the program headers in the ELF, in the same way as the old `ProgramHeaders` key did. Finally, the `SectionHeaderTable` chunk type is for describing that table. In particular, it provides the ability to control the order of the section headers, as well as whether some or all headers should be omitted from the table. The latter feature is useful for writing arbitrary contents in section form without making the data visible to tools via a section header. Note that the section header order is independent of the section data order: the `Sections:` ordering provides the data ordering. Thus a section header could appear last in the table, while its data is first in the ELF (after the file header).
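
To make the four chunk kinds concrete, here is a rough, untested sketch of how such a description might look. Only `Type` and `Offset` are taken from the description above. The `Segments` key under `ProgramHeaderTable` is a hypothetical name used for illustration, and the remaining field spellings are assumptions about the yaml2obj syntax that may not match the patch exactly.

```yaml
--- !ELF
FileHeader:
  Class: ELFCLASS64
  Data:  ELFDATA2LSB
  Type:  ET_EXEC
Sections:
  # Program header table chunk, placed first after the file header.
  # "Segments" is a hypothetical key name, not confirmed by the patch.
  - Type: ProgramHeaderTable
    Segments:
      - Type:     PT_LOAD
        FirstSec: .text
        LastSec:  .text
  # Any Type that is not one of the special kinds is read as a section type.
  - Name:  .text
    Type:  SHT_PROGBITS
    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
    Size:  0x10
  # A fill creates a gap in the output (field names assumed).
  - Type:    Fill
    Pattern: "CC"
    Size:    0x8
  # An explicit Offset overrides the automatically calculated one.
  - Name:   .data
    Type:   SHT_PROGBITS
    Offset: 0x200
  # Header order is independent of data order: .data's header comes first
  # here even though its data follows .text in the file. Implicit sections
  # (e.g. the string tables) may also need listing; omitted for brevity.
  - Type: SectionHeaderTable
    Sections:
      - Name: .data
      - Name: .text
```

Note that the ordinary sections need no extra line to mark them as sections; only the three special `Type` values change how the chunk is interpreted.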

In D95591#2531495 <https://reviews.llvm.org/D95591#2531495>, @MaskRay wrote:

> If the idea is that we will have ability to create two program header tables, this seems fine. Such an ability has no use though because only the one referenced by `e_phoff` matters.
> So I am still on the fence whether there is a good syntax.

The idea is to allow control over where the program header table is in the ELF object. I don't think the goal is to allow multiple tables. However, should a need ever arise to do this or something similar, the design is easily extensible, which is actually the reason we switched to this design from the older one.

In D95591#2532708 <https://reviews.llvm.org/D95591#2532708>, @labath wrote:

> The way I see it, the root cause of the problem here is that the top item is called "Sections". The name kind of stopped being correct back when we allowed non-section filler chunks. Now it will be even less correct. If you follow this up with (I don't know if it's in the works, but it would totally make sense to me) a change to control the placement/contents of section headers, it will become amusingly (confusingly?) self-referential.

You may have missed this: the `SectionHeaderTable` key was added relatively recently and allows us to describe the section header table. It used to be an independent top-level key but, following a recent change, it is now another of the chunk kinds described above. If it is omitted, the section header table is implicitly added at the end.

> If, instead, the top level item was called "Chunks", or "Layout", or "Contents", then it would not be weird to see program (or section) headers described inside. OTOH, it would make the (common) case of describing a section longer (by one line) as they would no longer automatically get the "section" kind.

As noted above, by using the existing Type field to recognise special chunk kinds, and then with the default behaviour being to interpret the value as some section type, I don't think we need an extra line.

I did consider whether it would be worthwhile to keep the `ProgramHeaders` tag as an option, to avoid the need to explicitly place the table somewhere in the output and to update all the tests, but I don't think there is much benefit to this, and it increases code complexity.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95591/new/

https://reviews.llvm.org/D95591