Support dead-stripping in ELF objects

Nick Kledzik kledzik at apple.com
Mon Apr 8 15:03:34 PDT 2013


On Apr 8, 2013, at 2:42 PM, "Robinson, Paul" <Paul_Robinson at playstation.sony.com> wrote:

> Okay, just to restate the specification all in one place.
> 
> We define three new ELF section-header flags:
> 
> * SHF_ATOMIZED
> The symbols defined for this section (other than the STT_SECTION symbol)
> have the following properties.
> - The range [st_value, st_value + st_size) for each symbol does not overlap
>  the range for any other symbol.
> - For all symbols, st_value + st_size <= sh_size (the end of the section).
> - All relocations targeting this section shall resolve to values within the
>  range for some symbol.
I'm still worried this restriction may prevent some compiler optimizations.  For instance:

int attrs[26];

int attr_of_lower_char(char c) {
	return attrs[c-'a'];
}

This function assumes a character is in the range of a lowercase letter.  The compiler could 
optimize this to:
     mov reg1, &attrs - 0x61
     load reg2, (reg1 + reg0)
Basically, the minus 0x61 is merged into the address calculation, so no separate
instruction is needed for the subtraction.  But that means a relocation that points
outside (before) the target symbol.
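
To make the concern concrete, here is a minimal sketch of the check a linker
might apply under the proposed rule.  Everything in it is hypothetical: struct
atom is a cut-down stand-in for an ELF symbol, the function names are made up,
attrs is assumed to sit at offset 0 of its section, and int is assumed to be
4 bytes (so the folded constant, once scaled, is 0x61 * 4 = 0x184).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ELF symbol: only the two
   fields that the proposed SHF_ATOMIZED rule talks about. */
struct atom {
    uint64_t st_value; /* offset of the symbol within its section */
    uint64_t st_size;  /* size of the symbol in bytes */
};

/* attrs[26] with an assumed 4-byte int, placed at section offset 0. */
static const struct atom example_attrs = { 0, 26 * 4 };

/* Section-relative target of a relocation "symbol + addend".  The addend
   is signed; a target before offset 0 wraps to a huge unsigned value,
   which the range check below rejects. */
static uint64_t reloc_target(const struct atom *sym, int64_t addend)
{
    assert(sym != NULL);
    return sym->st_value + (uint64_t)addend;
}

/* True if target lies in [st_value, st_value + st_size) of some atom. */
static bool target_in_some_atom(const struct atom *atoms, size_t n,
                                uint64_t target)
{
    assert(atoms != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (target >= atoms[i].st_value &&
            target - atoms[i].st_value < atoms[i].st_size)
            return true;
    }
    return false;
}
```

With these definitions, a relocation against attrs with addend 8 passes the
check, while the folded form with addend -0x184 does not, which is exactly the
case the rule as written would forbid.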


> 
> The range for each symbol is called an "atom."  There are no relocations
> addressing any gaps between atoms.
> 
> Note in particular this does not disallow using .section+N relocations,
> as long as the target address is in the range of some atom.
> 
> (Michael, is it really required to have a symbol with st_value = 0, if
> we have all the other requirements?  Typically we would, but strictly
> speaking a "gap" at the start of the section should not be a problem?)
> 
> * SHF_SUBSECTIONS_VIA_SYMBOLS
> SHF_ATOMIZED must be set if SHF_SUBSECTIONS_VIA_SYMBOLS is set.
> In addition, the following properties hold for the entire object file.
> - No relocation uses the STT_SECTION symbol for this section.
> - If any relocation uses a symbol defined in this section, the addend
>  must be less than st_size for that symbol.
> - The range [st_value, st_value + st_size) of any symbol in the section
>  may be moved to a different relative location.
> 
> Each "atom" in this section is also called a "subsection."  Any relocation
> using the symbol for a subsection must resolve to an address within the
> subsection.  There are no location dependencies between subsections, other
> than those expressed by relocations.
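
The per-relocation rules for this flag could be checked roughly as follows.
This is only a sketch under stated assumptions: struct sym, struct reloc and
reloc_ok_for_subsections are made-up names, is_section_symbol stands in for
testing the symbol's type against STT_SECTION, and treating a negative addend
as a violation is my reading of "must resolve to an address within the
subsection", not something the text above spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol and relocation records. */
struct sym {
    uint64_t st_value;       /* offset within the section */
    uint64_t st_size;        /* size of the subsection in bytes */
    bool is_section_symbol;  /* stand-in for "type is STT_SECTION" */
};

struct reloc {
    const struct sym *sym;   /* symbol the relocation refers to */
    int64_t addend;          /* signed addend */
};

/* True if the relocation satisfies the two relocation rules listed for
   SHF_SUBSECTIONS_VIA_SYMBOLS: it does not use the section symbol, and
   symbol + addend stays inside that symbol's own range. */
static bool reloc_ok_for_subsections(const struct reloc *r)
{
    assert(r != NULL && r->sym != NULL);
    if (r->sym->is_section_symbol)
        return false;        /* rule: no STT_SECTION relocations */
    if (r->addend < 0)
        return false;        /* assumed: before the subsection is out of range */
    return (uint64_t)r->addend < r->sym->st_size;
}
```

A linker would only be free to move a subsection if every relocation that
refers to its section passes a check of this kind.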

How does this interact with weak definitions?  That is, how do weak copies of
inline functions defined in headers work today with ELF?  Are all copies left in
the final linked image, with only one being used?  Or does the linker actually
remove unused weak functions from the middle of a section?  Or are all weak
definitions always in their own section?

Are there any restrictions on how these new flags interact with COMDAT groups?

-Nick

> 
> * SHF_DEADSTRIP
> This entire section may be omitted from the output file, if it is dead
> (i.e., there are no references to any symbol defined in the section).
> If SHF_SUBSECTIONS_VIA_SYMBOLS is also set, individual subsections
> may be omitted from the output file, if they are dead.
> 
> 
> We define one new ELF symbol flag:
> 
> * STF_NO_DEADSTRIP
> The linker may not remove this symbol from the output file.
> This symbol flag takes precedence over the SHF_DEADSTRIP section flag.
> If the symbol defines a subsection, the subsection must be considered live.
> If the symbol does not define a subsection, the symbol's entire section
> must be considered live.
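
For what it is worth, the liveness rule described here amounts to a mark phase
over the relocation graph.  Below is a minimal sketch, not a description of any
existing linker: struct subsection, mark_live and mark_roots are hypothetical
names, no_deadstrip stands in for the proposed STF_NO_DEADSTRIP flag, refs
records which subsections a subsection's relocations point at, and the only
roots considered are the no-dead-strip symbols (a real linker would also root
the entry point and exported symbols).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS 4

/* Hypothetical per-subsection record used only for this sketch. */
struct subsection {
    bool no_deadstrip;      /* stand-in for the proposed STF_NO_DEADSTRIP */
    size_t refs[MAX_REFS];  /* indices of subsections referenced by relocations */
    size_t nrefs;           /* number of valid entries in refs */
    bool live;              /* set by the mark phase */
};

/* Mark subsection i live, then everything it references, recursively. */
static void mark_live(struct subsection *subs, size_t n, size_t i)
{
    assert(subs != NULL && i < n);
    assert(subs[i].nrefs <= MAX_REFS);
    if (subs[i].live)
        return;             /* already visited; also breaks cycles */
    subs[i].live = true;
    for (size_t k = 0; k < subs[i].nrefs; k++)
        mark_live(subs, n, subs[i].refs[k]);
}

/* Start marking from every subsection whose symbol is no-dead-strip. */
static void mark_roots(struct subsection *subs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (subs[i].no_deadstrip)
            mark_live(subs, n, i);
    }
}
```

After mark_roots runs, any subsection in an SHF_DEADSTRIP section whose live
field is still false would be a candidate for removal.
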
> 
> 
> How's that?
> --paulr



