[llvm-commits] Atom alignment

Fri Oct 19 07:16:43 PDT 2012

Nick,

I'm glad to learn this, for I thought of something else that the GNU 
assembler does, namely partial fix-ups.

For instance, in the code below:

foo:
	enter $16
	...
	leave
	cmp $0, %eax
	je bar
	ret

	.size foo, . - foo

bar:
	...
	ret

	.size bar, . - bar

The assembler may choose to fix up the relocation for the conditional 
branch in foo to bar on its own and omit the PC-relative relocation in 
the resulting object, since the section layout is fixed and so is the 
distance between the branch and its target.
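
A minimal Python sketch of such an assembler-side fix-up, assuming an 
x86-style 32-bit displacement measured from the end of the displacement 
field. The function name and the offsets 0x10 and 0x20 are hypothetical, 
chosen only for illustration:

```python
import struct

def resolve_pc_relative(section: bytearray, fixup_offset: int, target_offset: int) -> int:
    """Patch a 32-bit PC-relative displacement in place and return it.

    Assumption: the displacement is relative to the end of the 4-byte
    field, i.e. the address of the next instruction (x86 rel32 style).
    """
    displacement = target_offset - (fixup_offset + 4)
    struct.pack_into("<i", section, fixup_offset, displacement)
    return displacement

# Hypothetical layout: "je bar" encoded as 0F 84 rel32 at offset 0x10,
# with bar starting at offset 0x20 of the same section.
section = bytearray(0x30)
section[0x10:0x12] = b"\x0f\x84"
disp = resolve_pc_relative(section, 0x12, 0x20)
```

Once the displacement is patched this way, no relocation needs to be 
emitted, so a linker that later splits the section into atoms finds no 
recorded reference from foo to bar.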

In this case, as in the other one below, chaining atoms together would 
likely keep the original semantics in the section.
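
A sketch of why chaining helps with dead-stripping, in Python. It 
assumes, purely for illustration, that chained atoms are kept or dropped 
together; the atom names and the live_atoms function are hypothetical 
and do not describe any actual linker's implementation:

```python
from collections import defaultdict

def live_atoms(roots, references, follow_ons):
    """Return the set of atoms that survive dead-stripping.

    references: dict mapping an atom to the atoms it refers to through
                relocations.
    follow_ons: list of (atom, next_atom) pairs. Assumption of this
                sketch: chained atoms live or die together.
    """
    # Make chain links symmetric so liveness spreads in both directions.
    linked = defaultdict(set)
    for atom, nxt in follow_ons:
        linked[atom].add(nxt)
        linked[nxt].add(atom)

    live, worklist = set(), list(roots)
    while worklist:
        atom = worklist.pop()
        if atom in live:
            continue
        live.add(atom)
        worklist.extend(references.get(atom, ()))
        worklist.extend(linked.get(atom, ()))
    return live

# bar reaches the jump table only through "bar - 16", which is a
# relocation against bar itself, so there is no bar -> table edge.
refs = {"main": ["bar"], "foo": ["table"], "table": ["bar", "goo", "car"]}
unchained = live_atoms(["main"], refs, [])
chained = live_atoms(["main"], refs, [("foo", "table"), ("table", "bar")])
```

Without the chain, only main and bar survive and the jump table is lost. 
With it, the table is kept, at the cost of also keeping the otherwise 
unreferenced foo.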

Thanks,

-- 
Evandro Menezes          Austin, TX          emenezes at codeaurora.org
Qualcomm Innovation Center, Inc is a member of the Code Aurora Forum

On 10/18/12 18:38, Nick Kledzik wrote:
>
> On Oct 18, 2012, at 3:19 PM, Evandro Menezes wrote:
>> I wonder how much of this is because we're breaking up the indivisible unit in ELF, sections, into symbol atoms.  AFAIK, because we cannot rely on the assembler programmer being diligent about specifying symbol sizes, the ELF reader gobbles up everything between symbols, even if that is more than the indicated size.
>>
>> The reason for this gluttony is that there is code out there that an assembler accepts that should also be accepted by the linker.  For example, jump tables.  Though I don't advocate it being written this way, an assembler is typically fine with it:
>>
>> foo:
>> 	...
>> 	jmp (.L123 + %eax * 8)
>> 	ret
>>
>> 	.size foo, . - foo
>>
>> .L123:
>> 	.long bar
>> 	.long goo
>> 	.long ...
>> 	...
>> 	.long car
>> 	.long ...
>>
>> bar:
>> 	...
>> 	jmp (bar - 16 + %eax * 8)
>> 	ret
>>
>> 	.size bar, . - bar
>>
>> In this case, .L123 doesn't have a size.  Even if a compiler would be careful enough to specify the jump-table size, a sloppy assembler programmer might not be as careful.  And if an assembler doesn't complain about it and generates a valid ELF file, the linker should take it as is.
>>
>> Now, assume that foo's atom is not referenced and thus discarded. Consequently, so is .L123's atom, including the data in it that the function bar relies on and refers to only indirectly.  Then, the code in the function bar is broken.
>>
>> So, please, bear with me, I wonder if the ELF model fits neatly into the atom model.  And, if not, how could the atom model be improved to accommodate it and perhaps other section-based file formats?
>
> The mach-o file format has the same problem as ELF, but we've been successfully using the atom model in the darwin linker for 7+ years now.
>
> The trick is that we have an opt-in directive in assembly files (.subsections_via_symbols).  The name is historic, but what it means is that the linker can assume the file follows some rules.  The compiler always follows the rules and always uses that directive.  Hand-written assembly can (if the author wants) follow the rules and use the directive.
>
> From the linker's perspective, if the directive was not used, it has to be more conservative in what it can do with the file.  In particular, it adds a "follow-on" reference from each atom in a section to the next one.  The follow-on atoms constrain the layout engine so that particular atoms must be laid out right after one another.  So, if an order file is used to move one atom, there may be a whole train of atoms that moves with it.
>
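
The "train" effect can be sketched in a few lines of Python. This is 
only an illustration under stated assumptions: follow-on links form 
acyclic chains, a chain is placed as a whole where its highest-priority 
member is requested, and the layout function and atom names are 
hypothetical rather than taken from any linker:

```python
def layout(atoms, follow_ons, order):
    """Return atoms in their final order.

    atoms:      atoms in original file order.
    follow_ons: dict mapping an atom to the atom that must come right
                after it (assumed to form acyclic chains).
    order:      atom names from an order file, highest priority first.
    """
    # A chain head is an atom that is nobody's follow-on.
    followers = set(follow_ons.values())
    chain_of = {}
    for atom in atoms:
        if atom in followers:
            continue
        chain = [atom]
        while chain[-1] in follow_ons:
            chain.append(follow_ons[chain[-1]])
        for member in chain:
            chain_of[member] = chain

    result, placed = [], set()
    # Ordered atoms first, then everything else in file order.
    for atom in list(order) + list(atoms):
        chain = chain_of.get(atom)
        if chain is None or chain[0] in placed:
            continue  # unknown name, or chain already emitted
        placed.add(chain[0])
        result.extend(chain)  # the whole train moves together
    return result

atoms = ["a", "foo", "table", "bar", "z"]
free = layout(atoms, {}, ["bar"])
chained = layout(atoms, {"foo": "table", "table": "bar"}, ["bar"])
```

With no follow-on links, asking for bar first moves only bar. With the 
foo, table, bar chain in place, the same request pulls all three atoms 
to the front in their original relative order.
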
> In the example above, the jump-table between foo and bar should be parsed by the ELF reader as an atom that has no name.
>
> -Nick
>