[llvm-dev] [RFC] Adding support for dynamic entries in yaml2obj

James Henderson via llvm-dev llvm-dev at lists.llvm.org
Fri Jan 18 02:19:29 PST 2019


Okay, that all sounds reasonable to me. If auto-adding strings to a
non-default section is hard, then that's fine, I think, at least for now. We
can improve it at a later date, if necessary.

On Fri, 18 Jan 2019 at 01:16, Armando Montanez <amontanez at google.com> wrote:

> 1) Producing an error in this situation is reasonable. Dynamic symbols
> don’t produce an error when .dynstr has explicit content specified, and
> that has been a source of confusion for me before.
> 2) For the sake of simplicity, strings won’t be added to the linked
> section unless the linked section is the default .dynstr. In all other
> cases, a numeric offset value must be used to specify the location of a
> string. I’ve provided more details on this at the end of this reply.
> 3) I was initially considering this as well, and the original patch
> already does this for a few tags. If that’s generally desirable, I’m happy
> to keep this in and improve control over automatic generation of tags. The
> current behavior is that user-defined tags take precedence over the
> auto-generated tags.
> 4) This is already implemented in yaml2obj for all section types (though
> the symbol tables forcibly override the linked section if symbols are
> specified). As my patch stands right now, .dynamic currently links to
> .dynstr by default, but only if dynamic entries have been specified. I
> personally feel that is appropriate behavior. Manually specifying “Link: 0”
> should properly override the default in the case that dynamic entries are
> specified. Alternatively, if no dynamic entries are specified, the linked
> section defaults to 0.
>
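To make the default-link rule in point 4 concrete, here is a minimal sketch in Python. It is not code from the patch: the helper name `resolve_dynamic_link` and the use of `None` to stand for an omitted "Link:" field are assumptions made purely for illustration.

```python
from typing import Optional, Union

DEFAULT_STRTAB = ".dynstr"


def resolve_dynamic_link(link: Optional[Union[str, int]],
                         has_entries: bool) -> Union[str, int]:
    """Pick the section that .dynamic links to (hypothetical helper).

    `link` is None when the YAML omits "Link:"; otherwise it is the
    section name or index the user wrote, including an explicit 0.
    """
    if link is not None:
        # An explicit value, even 0, always overrides the default.
        return link
    # No "Link:" given: default to .dynstr only when entries exist.
    return DEFAULT_STRTAB if has_entries else 0
```

With this shape, `resolve_dynamic_link(0, True)` stays 0, which is the "Link: 0" override case described above.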
> Pretty much every use case you’ve brought up is either already implemented
> in my patch, or wouldn’t be too difficult to add. The only major exception
> would be allowing strings (for tags like DT_SONAME) to be added to string
> tables other than .dynstr. The existing design of yaml2obj doesn’t make
> that very approachable. It’s more reasonable to attempt to search for the
> specified string in the linked section, but that would require some
> significant structural changes as well.
>
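For reference, the "search for the specified string in the linked section" idea could look roughly like the Python sketch below. It assumes the linked section's contents are already available as raw bytes, which is exactly the part that the current yaml2obj structure makes awkward; `find_string_offset` is a hypothetical name.

```python
from typing import Optional


def find_string_offset(table: bytes, name: str) -> Optional[int]:
    """Return the offset of NUL-terminated `name` in `table`, or None."""
    needle = name.encode("utf-8") + b"\0"
    offset = table.find(needle)
    return offset if offset != -1 else None


# A hand-written string table: offset 0 is the empty string.
strtab = b"\0libfoo.so\0libsomething.so\0"
soname_offset = find_string_offset(strtab, "libsomething.so")
```

Matching a suffix of a longer string (for example "foo.so" inside "libfoo.so") is fine here, since an ELF string is simply read from its offset up to the next NUL byte.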
>
> On Thu, Jan 17, 2019 at 1:55 AM James Henderson <
> jh7370.2008 at my.bristol.ac.uk> wrote:
>
>> Thanks for bringing this up. Since you posted on the review, I've been
>> thinking more about the different options and overall design of yaml2obj's
>> dynamic sections (including dynstr and dynsym) and how they work from a
>> user's perspective, not least motivated by
>> https://reviews.llvm.org/D56791, where I had to hand-craft a large
>> .dynamic section, and ended up fighting with .dynsym and .dynstr too (see
>> also https://bugs.llvm.org/show_bug.cgi?id=40339, for example).
>>
>> As for the proposal, it sounds reasonable to be able to specify numeric
>> and string arguments as you propose. However, I do have some
>> questions/thoughts/points:
>>
>> 1) What should happen if an explicit .dynstr content is specified (or
>> more specifically, explicit content is specified for the linked section,
>> see point 2)? My suggestion would be that it should be an error to specify
>> string values in .dynamic and explicit content in .dynstr at the same time.
>> 2) I'd like to avoid the same issue that is present in
>> https://bugs.llvm.org/show_bug.cgi?id=40337, namely that the .dynamic
>> section auto-populates strings in .dynstr, even though it is linked against
>> some other section. Note that this other section could also be a string
>> table, so using strings in that instance would still be valid.
>> 3) It would be nice to have some mechanism to optionally auto-populate
>> the .dynamic section with DT_STRTAB, DT_STRSZ, DT_SYMTAB, etc.
>> 4) It should be possible to omit the "Link:" field of the header, to get
>> a value of 0 there, or to specify a section index, in place of a section
>> name.
>>
>> Related to point 3), I foresee several different use-cases:
>>
>> a) users want a regular .dynamic section, with DT_STRTAB etc in, but want
>> the values auto-populated for everything. They don't want to specify any
>> tags at all. yaml2obj does the hard work of fetching the addresses and
>> sizes automatically;
>> b) similar to a) but a user wants to be able to extend a .dynamic section
>> with other tags (e.g. DT_SONAME);
>> c) again similar to a), but a user wants to be able to override some of
>> the auto-generated tags (but not necessarily all of them);
>> d) a user wants to completely control the content with normal-looking
>> tags, i.e. no auto-generation at all (e.g. they want to create a dynamic
>> section without DT_STRTAB);
>> e) a user wants to be able to hand-craft the content completely (e.g. to
>> create truncated tags etc) - this could probably be best done via having
>> an explicit content.
>>
>> I think all except e) could be accomplished by having an extra attribute
>> for dynamic sections, namely something like "DeriveTags: true/false" or
>> similar. A value of true would mean that all mandatory tags are
>> automatically generated, unless an explicit tag of the same value is
>> specified. In order to have multiple default-generated tags with the same
>> DT_* value (or none of them), this would need to be false.
>>
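As a rough illustration of how a "DeriveTags" attribute might behave for cases a) to d), here is a small Python sketch. Everything in it is hypothetical: the function name `merge_tags`, the tuple representation of entries, and the choice to emit derived tags before explicit ones are assumptions, not a description of yaml2obj.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, int]


def merge_tags(explicit: List[Entry], derived: Dict[str, int],
               derive_tags: bool) -> List[Entry]:
    """Combine user-written entries with auto-generated ones (hypothetical)."""
    if not derive_tags:
        # Case d): full manual control, emit exactly what the user wrote.
        return list(explicit)
    explicit_names = {tag for tag, _ in explicit}
    # Cases a) to c): keep a derived tag only if the user did not
    # write a tag of the same kind.
    kept = [(tag, value) for tag, value in derived.items()
            if tag not in explicit_names]
    return kept + list(explicit)


derived = {"DT_STRTAB": 0x2000, "DT_STRSZ": 0x20, "DT_SYMTAB": 0x1000}
overridden = merge_tags([("DT_STRSZ", 0)], derived, derive_tags=True)
```

In the last line the user-supplied DT_STRSZ replaces the derived one, while DT_STRTAB and DT_SYMTAB are still generated.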
>> I like your idea of a missing Value being inferred. This would work well
>> with the ability to turn off auto-generated tags, and would minimise the
>> pain of having to write each normally-defaulted tag by hand in this case.
>>
>> Note: I have no idea how well my above proposal works in relation to the
>> current design of yaml2obj. My personal ideal is that a user can use
>> yaml2obj to create basically any ELF object they want, primarily for
>> testing, so being able to test corner cases (e.g. missing tags, malformed
>> sections etc) is a significant requirement.
>>
>> James
>>
>> On Wed, 16 Jan 2019 at 19:13, Armando Montanez <amontanez at google.com>
>> wrote:
>>
>>> The goal of this proposal is to introduce a new type of YAML section for
>>> yaml2obj that allows the population of ELF .dynamic entries via a list of
>>> tag and value pairs. These entries are interpreted (and potentially
>>> validated) before being written to the .dynamic section. The simplest way
>>> to satisfy this requirement is for all dynamic entry values to be numeric
>>> values. Unfortunately, this inherently prevents entries like DT_SONAME,
>>> DT_NEEDED, DT_RPATH, and DT_RUNPATH from being specified alongside dynamic
>>> symbols due to the design of yaml2obj.
>>>
>>>
>>> This proposal introduces three ways to input a value for a dynamic
>>> entry. For a given dynamic tag, one or more of these methods of setting a
>>> value may be permitted. All of these cases are illustrated later with an
>>> example.
>>>
>>>
>>> 1. For dynamic entry strings that belong in .dynstr, the string itself
>>> can be used as the value for an entry. (ex. DT_SONAME, DT_NEEDED, DT_RPATH,
>>> and DT_RUNPATH)
>>>
>>>
>>> 2. A section name can be used in place of an address. In this case, the
>>> value of the dynamic entry is the sh_addr of the specified section. (ex.
>>> DT_STRTAB, DT_SYMTAB, DT_HASH, DT_RELA, and others)
>>>
>>>
>>> 3. A value can be specified using hexadecimal or decimal (or other bases
>>> supported by `llvm::to_integer()`). (ex. DT_STRSZ, DT_SYMENT,
>>> DT_RELAENT, and others)
>>>
>>>
>>> Here's an example to illustrate this design:
>>>
>>>
>>> !ELF
>>> FileHeader:
>>>  Class:           ELFCLASS64
>>>  Data:            ELFDATA2LSB
>>>  Type:            ET_DYN
>>>  Machine:         EM_X86_64
>>> Sections:
>>>  - Name: .dynsym
>>>    Type: SHT_DYNSYM
>>>    Address: 0x1000
>>>  - Name: .data
>>>    Type: SHT_PROGBITS
>>>    Flags: [ SHF_ALLOC, SHF_WRITE ]
>>>  - Name: .dynamic
>>>    Type: SHT_DYNAMIC
>>>    Entries:
>>>      - Tag: DT_SONAME
>>>        Value: libsomething.so
>>>      - Tag: DT_SYMTAB
>>>        Value: .dynsym
>>>      - Tag: DT_SYMENT
>>>        Value: 0x18
>>> DynamicSymbols:
>>>  Global:
>>>    - Name: foo
>>>      Type: STT_FUNC
>>>      Section: .data
>>>    - Name: bar
>>>      Type: STT_OBJECT
>>>      Section: .data
>>>
>>>
>>> The final section is of type SHT_DYNAMIC, and the "Entries" key
>>> illustrates the proposed addition. Walking through the three dynamic
>>> entries:
>>>
>>>
>>> 1. DT_SONAME: The value of this entry is a string that will be inserted
>>> into the dynamic string table (.dynstr) alongside the symbol names
>>> specified in DynamicSymbols. This is possible due to the nature of .dynstr
>>> being represented as a StringTableBuilder, and that .dynamic is linked to
>>> .dynstr by default. If the .dynamic section had been linked to a section
>>> other than .dynstr, the value of this entry would have to be a number (the
>>> offset of the string in the linked string table) rather than a string.
>>>
>>>
>>> 2. DT_SYMTAB: This tag may either be a numeric address or a valid
>>> section name, and this example illustrates the option of using the name of
>>> a section rather than the address. This resolves to 0x1000 since .dynsym is
>>> declared with an address of 0x1000. It would have been equally valid to
>>> make this entry have a value of 0x1000, but doing so would mean that
>>> changes to .dynsym's address would need to be manually updated in the
>>> dynamic entry. It's also worth noting that in the case of DT_SYMTAB it
>>> wouldn't be too difficult to infer this.
>>>
>>>
>>> 3. DT_SYMENT: This tag is restricted to only having numeric values. This
>>> entry could easily be inferred as well.
>>>
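The walk-through above can be sketched end to end in Python: resolve each Value to a number, then encode the result as little-endian Elf64_Dyn records (a signed 64-bit d_tag followed by a 64-bit d_un). The tag numbers come from the ELF gABI; the `.dynstr` byte layout and the helper names are assumptions, since the real StringTableBuilder is free to order and merge strings differently.

```python
import struct

# Tag numbers from the ELF gABI.
DT_SYMTAB, DT_SYMENT, DT_SONAME = 6, 11, 14

# Stand-ins for state yaml2obj would already have (assumed values).
section_addrs = {".dynsym": 0x1000}
dynstr = b"\0foo\0bar\0libsomething.so\0"


def resolve(value):
    """Turn a YAML value into the number stored in the entry."""
    if isinstance(value, int):
        return value                       # already numeric
    if value in section_addrs:
        return section_addrs[value]        # section name -> sh_addr
    return dynstr.index(value.encode() + b"\0")  # string -> .dynstr offset


def pack_entries(entries):
    """Encode (tag, value) pairs as little-endian Elf64_Dyn records."""
    return b"".join(struct.pack("<qQ", tag, resolve(v)) for tag, v in entries)


blob = pack_entries([
    (DT_SONAME, "libsomething.so"),
    (DT_SYMTAB, ".dynsym"),
    (DT_SYMENT, 0x18),
])
```

This sketch deliberately ignores which input methods a given tag permits; it only shows how the three kinds of Value collapse to a number before being written.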
>>>
>>> Note that it doesn’t make sense for DT_SYMENT to be any sort of string,
>>> so it is restricted to only being populated with a numeric value.
>>> Similarly, it doesn’t make sense for the value of DT_SONAME to ever be
>>> interpreted as the name of a section. Though at least one input method is
>>> required for a given dynamic tag, it’s typically the case that not all
>>> three are valid. It should also be possible to specialize upon certain tags
>>> for convenience. For example, DT_PLTREL could be specialized to allow “REL”
>>> and “RELA” to be used as values rather than requiring the values be entered
>>> in hexadecimal. Evaluating the needs for every dynamic tag isn’t within the
>>> scope of this proposal, so any tag without a specialization defaults to
>>> permitting numeric values or the name of a valid section (that is later
>>> converted to an address).
>>>
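One possible way to express "which input methods are valid for which tag" is a small lookup table with a default, sketched below in Python. The table contents, the "keyword" category for DT_PLTREL, and the `classify` helper are all hypothetical; only the DT_REL and DT_RELA numbers are taken from the ELF gABI.

```python
# Tag numbers from the ELF gABI.
DT_REL, DT_RELA = 17, 7

# Which input methods each tag accepts (illustrative, not exhaustive).
ALLOWED = {
    "DT_SONAME": {"number", "string"},
    "DT_SYMENT": {"number"},
    "DT_PLTREL": {"number", "keyword"},
}
# Tags without a specialization: a number or a valid section name.
DEFAULT_ALLOWED = {"number", "section"}
PLTREL_KEYWORDS = {"REL": DT_REL, "RELA": DT_RELA}


def classify(tag, text, section_names):
    """Decide how `text` is interpreted for `tag`, or raise ValueError."""
    allowed = ALLOWED.get(tag, DEFAULT_ALLOWED)
    try:
        int(text, 0)  # accepts decimal and 0x/0o/0b prefixes
        kind = "number"
    except ValueError:
        if "keyword" in allowed and text in PLTREL_KEYWORDS:
            kind = "keyword"
        elif "section" in allowed and text in section_names:
            kind = "section"
        else:
            kind = "string"
    if kind not in allowed:
        raise ValueError(f"{tag} does not accept a {kind} value: {text!r}")
    return kind
```

Because "section" is not in the DT_SONAME set, a DT_SONAME value of ".dynsym" is treated as a plain string rather than as a section reference, matching the restriction described above.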
>>>
>>> Some dynamic tags have strict enough constraints that they can be
>>> inferred. This limited set of dynamic tags could treat “Value” as an
>>> optional field since the value can be inferred from other parts of an ELF
>>> file. This isn’t a requirement for me, though it's something I'd certainly
>>> like to have.
>>>
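As a sketch of what inferring a missing Value could mean in practice, the Python below derives a few tags from the ELF class and from section metadata. The symbol entry sizes are sizeof(Elf32_Sym) and sizeof(Elf64_Sym) from the ELF gABI; the `infer_value` helper, the shape of the `sections` mapping, and the choice of which tags to cover are assumptions for illustration only.

```python
# sizeof(Elf32_Sym) and sizeof(Elf64_Sym) as defined by the ELF gABI.
SYM_ENTRY_SIZE = {"ELFCLASS32": 16, "ELFCLASS64": 24}


def infer_value(tag, elf_class, sections):
    """Infer a missing Value from the rest of the file (hypothetical helper).

    `sections` maps a section name to a dict with "addr" and "size".
    Returns None for tags this sketch cannot infer.
    """
    if tag == "DT_SYMENT":
        return SYM_ENTRY_SIZE[elf_class]
    if tag == "DT_SYMTAB" and ".dynsym" in sections:
        return sections[".dynsym"]["addr"]
    if tag == "DT_STRTAB" and ".dynstr" in sections:
        return sections[".dynstr"]["addr"]
    if tag == "DT_STRSZ" and ".dynstr" in sections:
        return sections[".dynstr"]["size"]
    return None


sections = {
    ".dynsym": {"addr": 0x1000, "size": 0x48},
    ".dynstr": {"addr": 0x2000, "size": 0x19},
}
```

For the ELFCLASS64 example earlier in this mail, `infer_value("DT_SYMENT", "ELFCLASS64", sections)` gives 0x18, the same value that was written out by hand.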
>>>
>>> I began working on a patch here, and it will later be updated to reflect
>>> the RFC:
>>>
>>> https://reviews.llvm.org/D56569
>>>
>>>
>>> Best,
>>>
>>> Armando
>>>
>>>
>>>