[PATCH] D56569: [ObjectYAML][yaml2obj][ELF] Add basic support for dynamic entries

Armando Montanez via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 14 16:56:25 PST 2019


amontanez added a comment.

The goal of this change is to introduce a new type of YAML section that allows the population of .dynamic entries by providing a list of tag and value pairs. These entries are interpreted (and potentially validated) before being written to the .dynamic section.

The simplest way to satisfy this requirement is for all dynamic entry values to be numeric values. However, this poses a few problems. The first problem I encountered is that if dynamic symbols are specified, the contents of .dynstr can't be manually specified or modified. This inherently prevents entries like `DT_SONAME`, `DT_NEEDED`, `DT_RPATH`, and `DT_RUNPATH` from being specified alongside dynamic symbols unless everything (all contents of .dynstr and .dynsym) is specified manually through raw contents (i.e. hex strings). The second problem is that in a given YAML file, multiple references to the same thing might not remain consistent. I had an issue where manually entering .dynamic contents via hex would often result in the annotations for the hex being inconsistent with the hex itself. A good example of this is maintaining a correct DT_STRSZ as the contents of a string table are updated.
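To make the `DT_STRSZ` problem concrete, here is a minimal sketch of an append-only string table. It is only an illustration: `build_strtab` is a hypothetical helper, not part of yaml2obj, and the real `StringTableBuilder` may order and merge strings differently. It shows how adding one string changes the table size, so a hand-written size in hex silently goes stale.

```python
def build_strtab(strings):
    """Return (table_bytes, offsets) for a simple append-only string table.

    Hypothetical helper for illustration only; it does not mirror the
    layout that LLVM's StringTableBuilder would actually produce.
    """
    data = b"\0"  # offset 0 is reserved for the empty string
    offsets = {}
    for s in strings:
        if s not in offsets:
            offsets[s] = len(data)
            data += s.encode("ascii") + b"\0"
    return data, offsets


# Only the dynamic symbol names.
before, _ = build_strtab(["foo", "bar"])
# The same table after a DT_SONAME string is added.
after, offsets = build_strtab(["foo", "bar", "libsomething.so"])

# DT_STRSZ must track len(table); a value written by hand would not.
print(len(before), len(after), offsets["libsomething.so"])  # prints: 9 25 9
```

Deriving the size from the table, rather than typing it in, is what keeps the two in sync.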

To address these issues, this patch introduces three ways to input a value for a dynamic entry. All of these cases are illustrated later with an example.

- For strings that belong in .dynstr, the string itself can be used as the value for an entry.
- A section name can be used in place of an address. In this case, the value of the dynamic entry is the `sh_addr` of the specified section.
- A value can be specified using hexadecimal or decimal (or other bases supported by `StringRef::to_integer()`).
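The three cases above could be resolved roughly as follows. This is a sketch under assumptions rather than the patch's code: `resolve_value`, `STRING_TAGS`, `section_addrs`, and `dynstr_offsets` are hypothetical names, and Python's `int(value, 0)` only approximates the bases `StringRef::to_integer()` accepts.

```python
# Tags whose value is an offset into the linked string table.
# (Assumed set, taken from the tags named earlier in this comment.)
STRING_TAGS = {"DT_SONAME", "DT_NEEDED", "DT_RPATH", "DT_RUNPATH"}


def resolve_value(tag, value, section_addrs, dynstr_offsets):
    """Turn the YAML string `value` into the number stored in the entry.

    section_addrs:  hypothetical map of section name -> sh_addr
    dynstr_offsets: hypothetical map of string -> offset in .dynstr
    """
    if tag in STRING_TAGS:
        # Case 1: the string itself is given; store its .dynstr offset.
        return dynstr_offsets[value]
    if value in section_addrs:
        # Case 2: a section name stands in for that section's address.
        return section_addrs[value]
    # Case 3: a plain number; base 0 auto-detects prefixes such as 0x.
    return int(value, 0)
```

With `section_addrs = {".dynsym": 0x1000}`, the call `resolve_value("DT_SYMTAB", ".dynsym", ...)` yields `0x1000` and `resolve_value("DT_SYMENT", "0x18", ...)` yields `24`.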

When writing the .dynamic section entries, yaml2elf can validate the correctness of some tags. For example, `DT_STRTAB`'s value must be consistent with the address specified in the section header of the linked section (in pseudocode, `StrtabEnt.Val == SHdrs[Dynamic.sh_link].sh_addr`). If dynamic entries have strict enough constraints, they can also be inferred and automatically added (whether or not this should happen and how to control this from the user's side is up for discussion).
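The `DT_STRTAB` check in the pseudocode above might look something like this. It is a sketch only: `check_strtab` and the data shapes (a list of `(tag, value)` pairs and a list of section-header dicts) are assumptions made for illustration, not the structures the patch uses.

```python
def check_strtab(entries, dynamic_sh_link, section_headers):
    """Return a list of error messages; an empty list means consistent.

    entries:         list of (tag, numeric value) pairs, already resolved
    dynamic_sh_link: sh_link of the .dynamic section (index of its strtab)
    section_headers: list of dicts, each with at least an "sh_addr" key
    """
    expected = section_headers[dynamic_sh_link]["sh_addr"]
    errors = []
    for tag, value in entries:
        if tag == "DT_STRTAB" and value != expected:
            errors.append(
                "DT_STRTAB is 0x%x but the linked section is at 0x%x"
                % (value, expected)
            )
    return errors
```

Returning messages instead of raising leaves it to the caller to decide whether a mismatch is a hard error or just a warning.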

Here's an example to illustrate some of this.

  !ELF
  FileHeader:
    Class:           ELFCLASS64
    Data:            ELFDATA2LSB
    Type:            ET_DYN
    Machine:         EM_X86_64
  Sections:
    - Name: .dynsym
      Type: SHT_DYNSYM
      Address: 0x1000
    - Name: .data
      Type: SHT_PROGBITS
      Flags: [ SHF_ALLOC, SHF_WRITE ]
    - Name: .dynamic
      Type: SHT_DYNAMIC
      Entries:
        - Tag: DT_SONAME
          Value: libsomething.so
        - Tag: DT_SYMTAB
          Value: .dynsym
        - Tag: DT_SYMENT
          Value: 0x18
  DynamicSymbols:
    Global:
      - Name: foo
        Type: STT_FUNC
        Section: .data
      - Name: bar
        Type: STT_OBJECT
        Section: .data

The final section is of type `SHT_DYNAMIC`, and the `Entries:` key illustrates the proposed addition. Walking through the three dynamic entries:

1. `DT_SONAME`: The value of this entry is a string that will be inserted into the dynamic string table (`.dynstr`) alongside the symbol names specified in `DynamicSymbols`. This is possible because `.dynstr` is represented as a `StringTableBuilder` and because `.dynamic` is linked to `.dynstr` by default. If the `.dynamic` section had been linked to a section other than `.dynstr`, the value of this entry would have to be a number (the offset of the string in the linked string table) rather than a string.

2. `DT_SYMTAB`: This tag doesn't have any unique treatment or validation specified, but it illustrates the option of using the name of a section rather than the address. This resolves to `0x1000` since `.dynsym` is declared with an address of `0x1000`. It would have been equally valid to give this entry a value of `0x1000` directly, but using the section name means the dynamic entry doesn't need to be updated when `.dynsym`'s address changes. It's also worth noting that in the case of `DT_SYMTAB` it wouldn't be difficult to infer the value, meaning it would be possible to later add this entry automatically if it was missing.

3. `DT_SYMENT`: This tag also doesn't have any unique treatment or validation specified. Since `0x18` isn't a section name, it is treated as a number. This entry could also easily be automatically inferred.
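As a side note on why `DT_SYMENT` is easy to infer: its value is the size of one symbol table entry, which is fixed by the ELF class. The sketch below recomputes it from the field layout of `Elf64_Sym` and `Elf32_Sym` using Python's `struct` module. The format strings are written out here for illustration and are not taken from the patch.

```python
import struct

# Elf64_Sym: st_name (u32), st_info (u8), st_other (u8), st_shndx (u16),
#            st_value (u64), st_size (u64). "<" means little-endian, no padding.
ELF64_SYM_FORMAT = "<IBBHQQ"

# Elf32_Sym: st_name (u32), st_value (u32), st_size (u32),
#            st_info (u8), st_other (u8), st_shndx (u16).
ELF32_SYM_FORMAT = "<IIIBBH"


def syment_for_class(elf_class):
    """Return the DT_SYMENT value implied by the ELF class."""
    fmt = ELF64_SYM_FORMAT if elf_class == "ELFCLASS64" else ELF32_SYM_FORMAT
    return struct.calcsize(fmt)


print(hex(syment_for_class("ELFCLASS64")))  # prints: 0x18
```

For the `ELFCLASS64` file in the example this gives `0x18`, matching the value written by hand.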

I understand that automatically adding .dynamic entries that aren't explicitly specified could easily be undesirable, but it also simplifies the process of generating an ELF binary with all the required dynamic entries. Currently I have one proposed "solution" for this: if a DT_NULL entry is specified at the end of a set of dynamic entries, no additional dynamic entries will be generated. This isn't currently implemented, but does give the user control over the behavior. This behavior would need to be very clearly documented. I'm not dead-set on having entries automatically added to the binary even if absent from the explicit list of dynamic entries, though it's something I'd certainly take advantage of when creating tests for llvm-elfabi. I'm very much open to discussion on this.
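Since the `DT_NULL` rule isn't implemented yet, the following is only a sketch of how it might behave. `finalize_entries` and the `(tag, value)` representation are hypothetical, and whether a terminating `DT_NULL` should be appended in the inferring case is deliberately left out.

```python
def finalize_entries(explicit, inferred):
    """Combine user-specified and inferred dynamic entries.

    explicit: list of (tag, value) pairs exactly as the user wrote them
    inferred: list of (tag, value) pairs the tool could add on its own

    If the user's list ends with DT_NULL, it is taken as complete and
    returned unchanged. Otherwise, inferred entries whose tags the user
    did not already specify are appended.
    """
    if explicit and explicit[-1][0] == "DT_NULL":
        return list(explicit)
    present = {tag for tag, _ in explicit}
    extra = [(tag, val) for tag, val in inferred if tag not in present]
    return list(explicit) + extra
```

For example, `[("DT_SONAME", 9), ("DT_NULL", 0)]` would pass through untouched, while `[("DT_SONAME", 9)]` would gain any inferred entries such as `("DT_SYMENT", 24)`.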

Hopefully this better explains the goals of this change, where it currently stands, and why it was designed this way.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56569/new/

https://reviews.llvm.org/D56569




