[LLVMdev] Global register variables/custom calling conventions

Andrew Jeffery andrew at aj.id.au
Sun Sep 27 01:51:55 PDT 2009

On 25/09/09 18:05, Tilmann Scheller wrote:
> Hi Andrew,
> On Wed, Sep 23, 2009 at 7:26 AM, Andrew Jeffery<andrew at aj.id.au>  wrote:
>> TCG separates the guest (ARM) code into blocks - my front end translates
>> these to LLVM IR for LLVM to translate to x86.  The assumption is that LLVM
>> will produce a better translation than TCG*. At some future point the
>> TCG-generated native block is replaced by LLVM's, and as such it needs
>> to hand control back to qemu in a state that it would expect from TCG.
>> Essentially the idea is to take the same input and produce the same output
>> as the original TCG block, but munge things in the middle to (hopefully) be
>> more efficient using LLVM's local optimisations.
> I'm curious, are you translating directly from ARM to LLVM IR or from
> llvm-qemu just put the pinned variables into memory (and lived with
> the performance penalty), but I agree it would be much nicer to have a
> custom calling convention in order to avoid this.

I'm translating straight from ARM to LLVM IR, avoiding TCG IR. I've 
taken the approach of implementing each basic block determined by TCG as 
a function in LLVM, which takes a pointer to the ARM CPU state struct as 
a parameter and returns the pointer to the struct. Currently I'm using 
LLVM's JIT to generate the target (x86_64) instructions, and the plan is 
to copy the instructions it generates into the translated code buffer. 
As mentioned in the previous email, the front-end has been designed to 
avoid location-specific (position-dependent) code, so the generated 
instructions should be suitable for copying around in memory.
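
A minimal C sketch of the block-as-function shape described above. The 
struct, the typedef and the example block are hypothetical stand-ins (the 
real CPU state struct in qemu is far larger, and the real functions are 
built as LLVM IR rather than written in C); only the signature, pointer in 
and the same pointer out, is taken from the description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for the ARM CPU state struct. */
typedef struct CPUARMStateSketch {
    uint32_t regs[16]; /* r0..r15, with r15 acting as the guest PC */
} CPUARMStateSketch;

/* Every translated block shares one signature: it takes the CPU state
 * pointer and hands the same pointer back to the caller. */
typedef CPUARMStateSketch *(*TranslatedBlockFn)(CPUARMStateSketch *env);

/* Roughly what a translated block for the single guest instruction
 * "add r0, r0, r1" might do (illustrative only). */
static CPUARMStateSketch *block_add_r0_r1(CPUARMStateSketch *env)
{
    assert(env != NULL);
    env->regs[0] = env->regs[0] + env->regs[1]; /* guest add */
    env->regs[15] += 4;                         /* step past one ARM insn */
    return env;                                 /* state pointer goes back */
}
```

Keeping every block behind one function-pointer type means the caller 
needs no per-block knowledge beyond where the state pointer lives.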

At the moment I'm not explicitly specifying a calling convention for the
functions, but LLVM seems to consistently put the ARM CPU state struct 
pointer (parameter) into %rdi. As a hack-around (since posting the 
original message to the list) I'm injecting a couple of MOV instructions 
around the copied function block: one before it to move the pointer from 
%r14 (AREG0, TCG's pointer to the CPU state struct on x86_64) to %rdi, 
and one after it to move it back from %rdi to %r14. Implementing a custom 
calling convention would avoid the injected MOVs, saving two instructions 
per block, I guess. This 
isn't a huge win, and as such isn't a big priority, however it'd be a 
nice thing to have... It'd make things slightly less hackish anyway.
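
The MOV injection can be sketched in C roughly as follows. This is only an 
illustration under assumptions: wrap_block and its parameters are 
hypothetical names, and the body is assumed to fall through to its end 
(any trailing RET already dealt with) so that the second MOV is actually 
reached. The byte sequences are the standard x86_64 encodings of the two 
register-to-register moves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86_64 encodings (REX.W + 89 /r) of the two register moves. */
static const uint8_t MOV_R14_TO_RDI[] = { 0x4C, 0x89, 0xF7 }; /* mov %r14, %rdi */
static const uint8_t MOV_RDI_TO_R14[] = { 0x49, 0x89, 0xFE }; /* mov %rdi, %r14 */

/* Hypothetical helper: copy a JIT-generated block body into dst,
 * bracketed by the two MOVs. Returns the number of bytes written,
 * or 0 if the result would not fit in dst_size bytes. */
static size_t wrap_block(uint8_t *dst, size_t dst_size,
                         const uint8_t *body, size_t body_len)
{
    size_t total = sizeof MOV_R14_TO_RDI + body_len + sizeof MOV_RDI_TO_R14;
    size_t off = 0;

    assert(dst != NULL && body != NULL);
    if (total > dst_size)
        return 0;

    /* Prologue: state pointer from TCG's register into the argument register. */
    memcpy(dst + off, MOV_R14_TO_RDI, sizeof MOV_R14_TO_RDI);
    off += sizeof MOV_R14_TO_RDI;

    /* The position-independent block body produced by the JIT. */
    memcpy(dst + off, body, body_len);
    off += body_len;

    /* Epilogue: state pointer back into TCG's register. */
    memcpy(dst + off, MOV_RDI_TO_R14, sizeof MOV_RDI_TO_R14);
    off += sizeof MOV_RDI_TO_R14;

    return off;
}
```

The cost is six extra bytes per block, which is what a custom calling 
convention pinning the parameter to %r14 would remove.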



