[LLVMdev] Implementing the ldr pseudo instruction in ARM integrated assembler

Fri Nov 1 12:32:36 PDT 2013

On Nov 1, 2013, at 12:15 PM, David Peixotto <dpeixott at codeaurora.org> wrote:

>>>>> I was thinking that without the .ltorg directive the constant pool
>>>>> would go at the end of the section.
>>>>> 
>>>> So where does the assembler place the constant pool(s) if that
>>>> directive isn't present? I was under the impression it was always
>>>> required.
>>> 
>>> From my understanding it is not required. I see that GCC will place it
>>> at the end of the section. I don't know if it will ever place it
>>> anywhere besides the end of the section when there is no .ltorg
>>> directive.
>>> 
>>> Here is the relevant section from the GNU as (binutils) docs
>>> (https://sourceware.org/binutils/docs/as/ARM-Directives.html):
>>> 
>>> """
>>> .ltorg
>>> This directive causes the current contents of the literal pool to be
>>> dumped into the current section (which is assumed to be the .text
>>> section) at the current location (aligned to a word boundary). GAS
>>> maintains a separate literal pool for each section and each
>>> sub-section. The .ltorg directive will only affect the literal pool of
>>> the current section and sub-section. At the end of assembly all
>>> remaining, un-empty literal pools will automatically be dumped.
>>> """
>>> 
>> 
>> What does ARM's documentation say?
> 
> The ARM documentation says that the assembler puts the current literal pool
> at the end of every code section, where the sections are determined by the
> AREA directive or the end of the assembly. If the default literal pool will
> be out of range the programmer can use the LTORG directive to assemble the
> current literal pool immediately.

Well, they’re consistent at least, so that’s good. That’s also pretty well-defined. I was afraid there was going to be some requirement that the assembler try to analyze things and figure out where good places were (à la the constant island pass). Glad to hear that’s not the case.

There’s still a problem for Darwin, or any other platform that uses subsections-via-symbols type layout tricks, though. There’s no assembler-time way to know how far apart the atoms in the section will be at runtime, as the linker can, and will, move things around.

The quick thought would be to emit them when the next atom begins, but that’ll fall over due to the typical “.align” which precedes the next function. The constant pool for the previous function would end up being emitted after the alignment directive for the following function, which will make that next symbol potentially not sufficiently aligned. For example,
_foo:
	ldr r1, =0x12345678
	…
	bx lr

.align 4
_bar:
	…

The result will be that _bar is not 16-byte aligned, but only 4-byte aligned, which will come as quite the surprise to the programmer.
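
To make the arithmetic concrete, here is a small Python model of the two layouts. Everything in it is an assumption for illustration: the helper names, the three-instruction body for _foo (the example above elides it), and the Darwin reading of ".align 4" as a 2^4 = 16-byte boundary. It is a sketch of the offsets only, not of how the assembler would be implemented.

```python
def align_up(addr, boundary):
    """Round addr up to the next multiple of boundary (a power of two)."""
    return (addr + boundary - 1) & ~(boundary - 1)


def layout(items):
    """Walk (kind, value) items in order and return each label's offset."""
    addr = 0
    labels = {}
    for kind, value in items:
        if kind == "label":
            labels[value] = addr
        elif kind == "bytes":
            addr += value
        elif kind == "align":
            addr = align_up(addr, value)
        else:
            raise ValueError(f"unknown item kind: {kind}")
    return labels


FOO_BODY = ("bytes", 12)   # assumed: three 4-byte ARM instructions in _foo
POOL = ("bytes", 4)        # one literal word holding 0x12345678
ALIGN_16 = ("align", 16)   # ".align 4" taken as a 2**4-byte boundary

# Pool flushed when the next atom's label is seen, i.e. after the .align.
naive = layout([("label", "_foo"), FOO_BODY, ALIGN_16, POOL, ("label", "_bar")])

# Pool flushed before the .align, as an explicit .ltorg after "bx lr" would do.
explicit = layout([("label", "_foo"), FOO_BODY, POOL, ALIGN_16, ("label", "_bar")])

print(naive["_bar"], naive["_bar"] % 16)        # offset 20, remainder 4
print(explicit["_bar"], explicit["_bar"] % 16)  # offset 16, remainder 0
```

In the naive ordering the 4-byte pool entry sits between the alignment padding and the label, so _bar ends up only word-aligned. Flushing the pool before the .align keeps the requested alignment.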

The next thought is to detect subsections-via-symbols and require the directive if another atom is seen and there is a non-empty constant pool. That gets into a bit of a chicken-and-egg problem, though, as the subsections-via-symbols directive is typically the last line of the .s file.

We could, perhaps, always require an explicit directive for all constant pools when using subsections-via-symbols and add a diagnostic check at the end of parsing (when we’re spitting out the non-empty pools) to see if there was a subsections-via-symbols directive in there anywhere.
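
A minimal sketch of that end-of-parse check, written in Python for brevity. The class and method names are hypothetical and do not correspond to anything in LLVM's MC layer. The only point is that the diagnostic has to be deferred until the whole file has been seen, because the .subsections_via_symbols directive usually comes last.

```python
class ConstantPoolChecker:
    """Toy model: tracks unflushed literals and the subsections-via-symbols flag."""

    def __init__(self):
        self.pending = []            # literals not yet dumped by an explicit directive
        self.saw_subsections = False

    def on_ldr_literal(self, value):
        # "ldr rN, =value" adds an entry to the current literal pool.
        self.pending.append(value)

    def on_ltorg(self):
        # An explicit .ltorg dumps the pool at the current location.
        self.pending.clear()

    def on_subsections_via_symbols(self):
        self.saw_subsections = True

    def finish(self):
        """Call at end of parsing; returns a list of diagnostics (empty if OK)."""
        if self.pending and self.saw_subsections:
            return [
                f"{len(self.pending)} literal(s) left without an explicit .ltorg; "
                "an explicit directive is required with .subsections_via_symbols"
            ]
        # Otherwise any remaining pool is simply dumped at the end of the section.
        return []


# Hypothetical input: one literal load, no .ltorg, directive on the last line.
checker = ConstantPoolChecker()
checker.on_ldr_literal(0x12345678)
checker.on_subsections_via_symbols()
print(checker.finish())
```

Because the flag is only consulted in finish(), it does not matter where in the file the directive appears, which sidesteps the ordering problem described above.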

Anyways, the main point of all of this is to reinforce the “there be dragons here” nature of this feature. It interacts with other parts of the assembler and the underlying assumptions of the platform in interesting ways. Lots of *really* careful test cases will be necessary.