[LLVMdev] Global register variables/custom calling conventions

Sun Sep 27 01:51:55 PDT 2009

On 25/09/09 18:05, Tilmann Scheller wrote:
> Hi Andrew,
>
> On Wed, Sep 23, 2009 at 7:26 AM, Andrew Jeffery <andrew at aj.id.au> wrote:
>> TCG separates the guest (ARM) code into blocks - my front end translates
>> these to LLVM IR for LLVM to translate to x86.  The assumption is that LLVM
>> will produce a better translation than TCG*. At some future point the
>> TCG-generated native block is replaced by LLVM's, and as such it needs
>> to hand control back to qemu in a state that it would expect from TCG.
>> Essentially the idea is to take the same input and produce the same output
>> as the original TCG block, but munge things in the middle to (hopefully) be
>> more efficient using LLVM's local optimisations.
> I'm curious, are you translating directly from ARM to LLVM IR or from
> TCG IR to LLVM IR?
>
> llvm-qemu just put the pinned variables into memory (and lived with
> the performance penalty), but I agree it would be much nicer to have a
> custom calling convention in order to avoid this.
>

I'm translating straight from ARM to LLVM IR, avoiding TCG IR. I've 
taken the approach of implementing each basic block determined by TCG as 
a function in LLVM, which takes a pointer to the ARM CPU state struct as 
a parameter and returns the pointer to the struct. Currently I'm using 
LLVM's JIT to generate the target (x86_64) instructions, and the plan is 
to copy the instructions it generates into the translated code buffer. 
As mentioned in the previous email, the front end has been designed to
avoid location-specific code, so the generated code should be suitable
for copying around in memory.
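
To make the block-as-function model concrete, here is a minimal C sketch
of what one translated block is equivalent to. It is not the actual
front end: the ArmCpuState struct is a hypothetical, heavily reduced
stand-in for qemu's real CPU state struct, the function name is made up,
and the PC handling is simplified.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for the ARM CPU state struct. */
typedef struct ArmCpuState {
    uint32_t regs[16];   /* r0..r15; r15 holds the guest program counter */
} ArmCpuState;

/* Hand-written equivalent of a translated block that contains the single
 * guest instruction "add r0, r0, r1". Like every block function, it takes
 * the CPU state pointer as its only parameter and returns that pointer. */
ArmCpuState *block_add_r0_r1(ArmCpuState *env)
{
    env->regs[0] = env->regs[0] + env->regs[1];
    env->regs[15] += 4;   /* simplified: step the guest PC by one 4-byte instruction */
    return env;
}
```

All guest state lives behind the one pointer, so the only thing the
surrounding code has to agree on is which register carries it in and out.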

At the moment I'm not explicitly specifying a calling convention for the
functions, but LLVM seems to consistently put the ARM CPU state struct 
pointer (parameter) into %rdi. As a workaround (since posting the
original message to the list) I'm injecting a couple of MOV instructions:
one before the copied function block to move the pointer from %r14
(AREG0 - TCG's pointer to the CPU state struct on x86_64) to %rdi, and
one after it to move the pointer back from %rdi to %r14. Implementing a
custom calling convention would avoid the injected MOVs, saving two
instructions per block, I guess. This isn't a huge win, and as such
isn't a big priority, but it'd be a nice thing to have... It'd make
things slightly less hackish anyway.
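
For illustration, the MOV injection described above could be sketched in
C roughly as follows. This is a sketch under assumptions, not the code
from the actual patch: emit_wrapped_block and its parameters are
hypothetical names, and it assumes the copied block body falls through
to the trailing MOV (i.e. it does not end in its own RET). The byte
sequences are the standard x86_64 encodings of the two register moves.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86_64 encodings (REX.W + 89 /r) of the two register-to-register moves. */
static const uint8_t MOV_R14_TO_RDI[3] = { 0x4C, 0x89, 0xF7 }; /* mov %r14, %rdi */
static const uint8_t MOV_RDI_TO_R14[3] = { 0x49, 0x89, 0xFE }; /* mov %rdi, %r14 */

/* Hypothetical helper: writes "mov %r14,%rdi", then the copied block body,
 * then "mov %rdi,%r14" into dst. Returns the number of bytes written, or 0
 * if dst_cap is too small to hold all three pieces. */
size_t emit_wrapped_block(uint8_t *dst, size_t dst_cap,
                          const uint8_t *body, size_t body_len)
{
    const size_t overhead = sizeof MOV_R14_TO_RDI + sizeof MOV_RDI_TO_R14;
    size_t off = 0;

    if (dst_cap < overhead || body_len > dst_cap - overhead)
        return 0;

    memcpy(dst + off, MOV_R14_TO_RDI, sizeof MOV_R14_TO_RDI);
    off += sizeof MOV_R14_TO_RDI;
    memcpy(dst + off, body, body_len);
    off += body_len;
    memcpy(dst + off, MOV_RDI_TO_R14, sizeof MOV_RDI_TO_R14);
    off += sizeof MOV_RDI_TO_R14;
    return off;
}
```

The wrapper costs six bytes (two three-byte instructions) per block,
which is the overhead a custom calling convention that passes the
pointer in %r14 directly would remove.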

Cheers,

Andrew