[LLVMdev] [lld][ELF] How to transfer st_other field value from input to output file

Rui Ueyama ruiu at google.com
Tue Nov 11 12:38:58 PST 2014


On Tue, Nov 11, 2014 at 12:31 PM, Nick Kledzik <kledzik at apple.com> wrote:

> On Nov 11, 2014, at 11:54 AM, Rui Ueyama <ruiu at google.com> wrote:
>
> This falls into the usual topic of whether or not we should have a
> generic map attached to an atom. You used a reference as an alternative for
> the map in this case but the basic idea is the same.
>
> Although using a reference would be practical, it still feels like a hack to
> me. It's awkward at least. Why don't you add an accessor to the attribute
> you want to DefinedAtom? We'll have a few or maybe ten more member
> functions in DefinedAtom, but it's not bad -- architectures that don't need
> them are able to just not use them. And the number of attributes we want is
> limited because the number of architectures we want to support in LLD is
> not that many.
>
> If there are architecture/platform specific atom attributes, I’m fine
> with adding more accessors to DefinedAtom.  We just need to review them to
> see if there are similar needs on multiple flavors and design names and
> values that are clear.
>
> Regarding References, the ELF flavor puts the raw ELF relocation type as
> the Reference Kind.  Mach-o does not do that.  The mach-o relocation type
> is only 4 bits.  You need to process lots of other information (including
> other bits in the reloc record, the instruction content, and perhaps a
> “paired” relocation to determine the “kind”).  So, Mach-O Reference Kind
> values are abstract and internal to the mach-o ArchHandler.  Given that,
> using a Reference Kind to track thumbness (which only ArchHandler_arm cares
> about) works well.
>
> That said, the ability to handle thumb and arm within a function is
> probably over-engineering.  I’d be fine with adding to DefinedAtom
> something like:
>
>   enum CodeModel {
>     // Note: all these values need word smithing
>     codeNA,
>     codeMIPS_PIC,
>     codeMIPS_micro,
>     codeMIPS_16,
>     codeARM_16,
>     codeARM_32,
>   };
>
>   virtual CodeModel codeModel() { return codeNA; }
>

Yup, that looks good. That would reduce the amount of code and the
complexity, I guess. We may want to add a prefix (like "machOCodeModel")
to that kind of accessor to make it easy to see that it's used for MachO.

You made a good point that two or more architectures may have similar or
identical needs and want to share the accessors. The accessors have to be
designed carefully and named accordingly. We can coordinate that by sending
any patch that touches DefinedAtom for review.
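
A minimal sketch of how such an accessor might be used, assuming a heavily
simplified DefinedAtom. The MipsDefinedAtom subclass and the isCompressedMips
helper are hypothetical names for illustration, not existing lld code, and the
accessor is made const here, which the snippet above does not specify.

```cpp
#include <cassert>

// Simplified stand-in for lld's DefinedAtom; the real class has many more
// members. Everything here is an illustrative assumption.
class DefinedAtom {
public:
  enum CodeModel {
    codeNA, // no special encoding; the default for most targets
    codeMIPS_PIC,
    codeMIPS_micro,
    codeMIPS_16,
    codeARM_16,
    codeARM_32,
  };

  virtual ~DefinedAtom() = default;

  // Targets that do not need the attribute simply inherit this default.
  virtual CodeModel codeModel() const { return codeNA; }
};

// Hypothetical MIPS atom that remembers the encoding seen by the reader.
class MipsDefinedAtom : public DefinedAtom {
public:
  explicit MipsDefinedAtom(CodeModel model) : _model(model) {}
  CodeModel codeModel() const override { return _model; }

private:
  CodeModel _model;
};

// A writer can query any atom through the base-class interface.
inline bool isCompressedMips(const DefinedAtom &atom) {
  DefinedAtom::CodeModel m = atom.codeModel();
  return m == DefinedAtom::codeMIPS_micro || m == DefinedAtom::codeMIPS_16;
}
```

With this shape, flavors that never override codeModel() pay nothing beyond
one extra virtual function in the base class.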


> -Nick
>
>
>
> On Tue, Nov 11, 2014 at 11:19 AM, Nick Kledzik <kledzik at apple.com> wrote:
>
>> I had a similar issue with arm vs thumb in mach-o.  Each function’s
>> thumbness is marked in its symbol table entry.
>>
>> But it is even worse: a function could change encoding in the middle
>> (only hand-coded assembly could do this).
>>
>> My solution was to add a new Reference Kind for mach-o which is the
>> current instruction encoding.  The offsetInAtom() is the offset where the
>> encoding kind changes.  Usually there is just one at offset zero that sets
>> the encoding for the whole function.  So determining the thumbness requires
>> scanning the References.  But it turns out in practice the scan is rarely
>> done because the result can be cached by whatever algorithm needs that info.
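
The scan described above could look roughly like the following sketch. The
types Encoding, EncodingRef and Atom are simplified, hypothetical stand-ins
rather than lld's real Reference and Atom classes, and the sketch assumes the
markers are sorted by offset and that ARM is the default when no marker exists.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins; not lld's real classes.
enum class Encoding { Arm, Thumb };

struct EncodingRef {
  uint32_t offsetInAtom; // offset at which the encoding switches
  Encoding encoding;     // encoding in effect from that offset onward
};

struct Atom {
  std::vector<EncodingRef> encodingRefs; // assumed sorted by offsetInAtom
};

// Returns the encoding in effect at `offset`: the last marker at or before
// it. With no marker at all, ARM is assumed as the default.
inline Encoding encodingAt(const Atom &atom, uint32_t offset) {
  Encoding current = Encoding::Arm;
  for (const EncodingRef &ref : atom.encodingRefs) {
    if (ref.offsetInAtom > offset)
      break;
    current = ref.encoding;
  }
  return current;
}

// Common case: a single marker at offset zero covers the whole function.
inline bool isThumbFunction(const Atom &atom) {
  return encodingAt(atom, 0) == Encoding::Thumb;
}
```

A caller that needs the answer repeatedly can store the result of
isThumbFunction once per atom instead of rescanning.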
>>
>> -Nick
>>
>>
>> On Nov 11, 2014, at 6:50 AM, Simon Atanasyan <simon at atanasyan.com> wrote:
>> > I was too optimistic. It is possible to use the contentTypes field for
>> > handling STO_MICROMIPS and I have a working solution but the solution
>> > is really ugly. This approach has at least the following two
>> > shortcomings:
>> >
>> > 1. A MIPS ELF symbol can hold multiple STO_xxx flags stored in the
>> > st_other field (STO_MIPS_PIC, STO_MIPS_MICROMIPS, STO_MIPS_MIPS16
>> > ...). Sometimes these flags can even be combined. If we use the
>> > contentTypes field, we have to define a separate ContentType flag for
>> > each such combination. So we get a combinatorial explosion.
>> >
>> > 2. If we handle MIPS specific ContentType flags together with other
>> > flags, it pollutes the common ELF code. If we factor out the
>> > processing of MIPS specific flags, we have to duplicate code because a
>> > symbol with, say, the STO_MICROMIPS flag should be processed (set up
>> > size, permissions, etc.) the same way as a regular
>> > DefinedAtom::typeCode symbol.
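
To illustrate the first point, here is a small sketch of treating st_other as
a bit field instead of enumerating one ContentType per combination. The
two-bit visibility mask follows the generic ELF layout; the two MIPS flag
values are assumptions to be checked against the ABI headers, and SplitOther,
splitStOther and mergeStOther are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// The low two bits of st_other hold the generic ELF visibility (STV_*).
constexpr uint8_t kVisibilityMask = 0x03;

// Assumed values for two of the MIPS flags named above; verify them
// against the MIPS ABI headers before relying on them.
constexpr uint8_t kStoMipsPic = 0x20;
constexpr uint8_t kStoMipsMicroMips = 0x80;

struct SplitOther {
  uint8_t visibility; // generic part, understood by common ELF code
  uint8_t targetBits; // opaque to common code, interpreted by the target
};

// Separates the generic visibility from the target-specific bits.
inline SplitOther splitStOther(uint8_t stOther) {
  SplitOther parts;
  parts.visibility = static_cast<uint8_t>(stOther & kVisibilityMask);
  parts.targetBits = static_cast<uint8_t>(stOther & ~kVisibilityMask);
  return parts;
}

// Rebuilds the st_other byte for the output symbol table.
inline uint8_t mergeStOther(const SplitOther &parts) {
  return static_cast<uint8_t>(parts.visibility | parts.targetBits);
}
```

Carrying the target bits as one opaque byte keeps any combination of flags
intact without the common code having to know what each bit means.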
>> >
>> > I considered creating a map symbol name => symbol flags, filling this
>> > map while reading object files, and using the map while writing the
>> > linked file. But I need to handle both local and global symbols, and
>> > it is possible to get symbols with the same name.
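
The name clash can be shown with a tiny sketch. FlagsByName and recordFlags
are hypothetical names, and the flag values are arbitrary placeholders.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Side table keyed by symbol name only, as in the rejected idea above.
using FlagsByName = std::map<std::string, uint8_t>;

// Records the flags for a symbol. Returns false when the name is already
// present, i.e. when a second, unrelated symbol would clash with the first.
inline bool recordFlags(FlagsByName &table, const std::string &name,
                        uint8_t flags) {
  return table.emplace(name, flags).second;
}
```

Two object files that each define a local symbol with the same name would
both map to one entry, so the second call returns false and one symbol's
flags cannot be recovered from the table.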
>> >
>> > It looks like the only solution (if I am not missing anything) is to
>> > add one more field to the DefinedAtom class to hold a
>> > target/architecture specific set of flags and modify the Native and
>> > YAML formats correspondingly. Interpretation of this field is
>> > completely target/architecture dependent.
>> >
>> > Any opinions?
>> >
>> > On Thu, Nov 6, 2014 at 7:09 PM, Simon Atanasyan <simon at atanasyan.com> wrote:
>> >> The STO_MIPS16 and STO_MICROMIPS flags denote that the symbol uses a
>> >> different "compressed" instruction encoding. Both these flags can be
>> >> combined with the usual "visibility" flags.
>> >>
>> >> It looks like adding a new flag to the contentTypes set might solve
>> >> the problem. Thanks for the idea. I will try to implement it.
>> >>
>> >> On Thu, Nov 6, 2014 at 6:52 PM, Shankar Easwaran
>> >> <shankare at codeaurora.org> wrote:
>> >>> One way to do that is to add new visibility / contentTypes values
>> >>> (whatever is relevant) for each of the values st_other picks?
>> >>>
>> >>> What are the other values st_other can take on MIPS?
>> >>>
>> >>> On 11/6/2014 8:50 AM, Simon Atanasyan wrote:
>> >>>> On MIPS, the st_other field in the ELF symbol table might contain some
>> >>>> additional MIPS-specific flags besides the visibility ones. These flags
>> >>>> should be copied to the output linked file. If the YAML => Native
>> >>>> conversion is switched off, there is no problem. But when the
>> >>>> conversion is enabled, we lose the st_other field values.
>> >>>>
>> >>>> So I need advice on how to keep this information. Is it a good idea to
>> >>>> extend the YAML and Native formats to store this data? Are there any
>> >>>> alternative solutions?
>> >
>> > --
>> > Simon Atanasyan
>>
>>
>
>