[LLVMdev] big bitcode files producing bad ARM asm

Nick Kledzik kledzik at apple.com
Mon Apr 25 20:34:07 PDT 2011


On Apr 25, 2011, at 8:17 PM, Alexander MacDonald wrote:
> On 25 Apr 2011, at 18:20, Jakob Stoklund Olesen wrote:
>> On Apr 25, 2011, at 6:01 PM, Alexander MacDonald wrote:
>> 
>>> I have a rather large bitcode file which, when run through "llc -march arm -O0", produces an asm file of about 500MB. Trying to assemble this file with the iOS assembler on OS X gives me lots of "branch out of range" errors, thanks to jump instructions overflowing the +/-32MB relative jump limit.
>>> 
>>> I've tried running llc with the hidden "-arm-long-calls" option, which solves the problem but forces everything to be an indirect branch. That feels a bit like overkill, does anybody have a suggestion for what the right solution might be?
>> 
>> I don't think any other solutions are currently supported.
>> 
>> One problem is that the linker can move functions around as it pleases, so there is no way of knowing which functions are going to be far away.
> 
> But the linker will fix branches that become "long-calls" after it's shuffled things around, right? So it would still be reasonable to try to get LLVM to at least codegen a single object file correctly, assuming that the codegen phase has some knowledge of roughly how big the branches will have to be when it is generating the asm. On second thought, it probably doesn't, unless it knows the size of all functions before writing out the asm (I'm not too familiar with the codegen phase).

If the problem is that you have zillions of small functions (as opposed to some MB-sized functions), then this is really a limitation of the mach-o object format.  The iOS linker tool will happily synthesize branch islands when the target of a bl is too far away (that is, it will add an island of code between functions that is just a jump instruction.  The bl branches to the island, which jumps to the final target).
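
To make the branch-island idea concrete, here is a toy Python model. It is only a sketch: the function name plan_islands, the flat +/-32 MB reach constant, and the idea of chaining several islands for very distant targets are assumptions made for illustration, not a description of how the linker actually places islands.

```python
# Toy model only: not the linker's real placement algorithm.
BL_REACH = 32 * 1024 * 1024  # assumed: approximate +/-32 MB reach of an ARM bl


def plan_islands(caller, target, reach=BL_REACH):
    """Return the addresses a call passes through: any islands, then the target.

    Each hop is at most `reach` bytes from the previous address, so every
    branch in the chain could be encoded as a direct branch.
    """
    hops = []
    here = caller
    while abs(target - here) > reach:
        # Place a (hypothetical) island as far toward the target as we can reach.
        here += reach if target > here else -reach
        hops.append(here)
    hops.append(target)
    return hops
```

A call whose target is already in range yields just the target itself; a call from address 0 to a function 80 MB away would, in this model, go through islands at 32 MB and 64 MB before reaching the function.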

In ARM mach-o, if the target of a bl is in the same translation unit, the assembler uses a local relocation and adjusts the instruction to branch to the target address.  If the address is too far, there is no way to encode it ;-(    You might think switching to an external relocation would work, but because mach-o has no RELA relocations, the addend is encoded in the instruction.  And, unfortunately, for pc-rel instructions like bl, the addend is encoded as the target of the bl instruction.  So an addend of zero means the instruction looks like it branches to address zero, which means the instruction must itself be within branch range of address zero.  So you are limited in how big your .o file can be.
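
The arithmetic behind this limit can be sketched in a few lines of Python. This is a minimal illustration assuming the ARM-mode (not Thumb) bl encoding: a signed 24-bit word offset measured from the instruction address plus 8. The helper names encode_bl and fits are made up for this example and do not come from the assembler or linker.

```python
# Assumes the ARM-mode (A32) bl encoding; Thumb bl is encoded differently.
BL_MIN_OFFSET = -(1 << 25)     # -32 MiB, most negative encodable byte offset
BL_MAX_OFFSET = (1 << 25) - 4  # +32 MiB - 4, most positive encodable byte offset


def encode_bl(insn_addr, target_addr):
    """Return the 32-bit encoding of an unconditional ARM-mode bl."""
    # In ARM state the PC reads as the instruction address plus 8.
    offset = target_addr - (insn_addr + 8)
    if offset % 4 != 0:
        raise ValueError("target is not word-aligned relative to the branch")
    if not BL_MIN_OFFSET <= offset <= BL_MAX_OFFSET:
        raise ValueError("branch out of range")
    # cond = AL (0xE), opcode bits 101, L = 1, then the signed 24-bit word offset.
    return 0xEB000000 | ((offset >> 2) & 0xFFFFFF)


def fits(insn_addr, target_addr):
    """True if a bl at insn_addr can reach target_addr directly."""
    try:
        encode_bl(insn_addr, target_addr)
        return True
    except ValueError:
        return False
```

Under this model, an addend of zero means the apparent target is address 0, so fits(insn_addr, 0) tells you whether a bl at that offset in the .o file can carry the addend at all. For example, fits(40 * 1024 * 1024, 0) is False: a bl placed 40 MB into the file is too far from address zero.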

You might get lucky and be able to just chop up the big assembly text file into pieces and assemble each one...

-Nick




