[LLVMdev] Global register variables/custom calling conventions
andrew at aj.id.au
Sun Sep 27 01:51:55 PDT 2009
On 25/09/09 18:05, Tilmann Scheller wrote:
> Hi Andrew,
> On Wed, Sep 23, 2009 at 7:26 AM, Andrew Jeffery<andrew at aj.id.au> wrote:
>> TCG separates the guest (ARM) code into blocks - my front end translates
>> these to LLVM IR for LLVM to translate to x86. The assumption is that LLVM
>> will produce a better translation than TCG*. At some future point the
>> TCG-generated native block is replaced by LLVM's, and as such it needs
>> to hand control back to qemu in a state that it would expect from TCG.
>> Essentially the idea is to take the same input and produce the same output
>> as the original TCG block, but munge things in the middle to (hopefully) be
>> more efficient using LLVM's local optimisations.
> I'm curious, are you translating directly from ARM to LLVM IR or from
> TCG IR to LLVM IR?
> llvm-qemu just put the pinned variables into memory (and lived with
> the performance penalty), but I agree it would be much nicer to have a
> custom calling convention in order to avoid this.
I'm translating straight from ARM to LLVM IR, avoiding TCG IR. I've
taken the approach of implementing each basic block determined by TCG as
a function in LLVM, which takes a pointer to the ARM CPU state struct as
a parameter and returns the pointer to the struct. Currently I'm using
LLVM's JIT to generate the target (x86_64) instructions, and the plan is
to copy the instructions it generates into the translated code buffer.
As mentioned in the previous email, the front end has been designed to
avoid emitting location-specific code, so the generated instructions
should be suitable for copying around in memory.
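To make the shape of these per-block functions concrete, here is a rough C-level sketch of what one of them amounts to. The struct layout and the names (arm_cpu_state, block_add_r0_r1) are made up for illustration; QEMU's real ARM CPU state struct is much larger, and the actual front end emits LLVM IR rather than C.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified guest state. QEMU's real ARM CPU
 * state struct has many more fields and a different layout. */
struct arm_cpu_state {
    uint32_t regs[16];   /* r0..r15 */
};

/* One translated guest block, here just a single "add r0, r0, r1".
 * The shape matches the description above: the CPU state pointer comes
 * in as the only parameter and the same pointer is handed back. */
struct arm_cpu_state *block_add_r0_r1(struct arm_cpu_state *env)
{
    env->regs[0] += env->regs[1];
    return env;
}
```

All guest state is reached through the pointer, which is why the generated code only needs that one value to be in a known register on entry.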
At the moment I'm not explicitly specifying a calling convention for the
functions, but LLVM consistently puts the ARM CPU state struct pointer
(the only parameter) into %rdi, which is where the System V x86_64 ABI
passes the first integer argument. As a workaround (added since posting
the original message to the list) I'm injecting a couple of MOV
instructions: one before the copied function block to move the pointer
from %r14 (AREG0 - TCG's pointer to the CPU state struct on x86_64) to
%rdi, and one after it to move the pointer back from %rdi to %r14.
Implementing a custom calling convention would avoid the injected MOVs,
saving two instructions per block. This isn't a huge win, so it isn't a
big priority, but it'd be a nice thing to have and would make things
slightly less hackish.
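For illustration, the MOV injection could look roughly like the sketch below. The helper name emit_wrapped_block and its buffer handling are hypothetical, not the code I'm actually running; the two byte sequences are the standard x86_64 encodings of the register-to-register moves (REX.W + 89 /r). It assumes the copied block bytes fall through at the end rather than returning, and that they are position independent.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86_64 encodings of the two injected moves (REX.W + 89 /r). */
static const uint8_t MOV_R14_TO_RDI[] = { 0x4C, 0x89, 0xF7 }; /* mov %r14, %rdi */
static const uint8_t MOV_RDI_TO_R14[] = { 0x49, 0x89, 0xFE }; /* mov %rdi, %r14 */

/* Hypothetical helper: copy an LLVM-generated block into the translated
 * code buffer, bracketed by the two moves. Returns the number of bytes
 * written, or 0 if the destination is too small. */
size_t emit_wrapped_block(uint8_t *dst, size_t dst_len,
                          const uint8_t *block, size_t block_len)
{
    const size_t overhead = sizeof MOV_R14_TO_RDI + sizeof MOV_RDI_TO_R14;
    uint8_t *p = dst;

    /* Written this way so the size check cannot overflow. */
    if (dst_len < overhead || block_len > dst_len - overhead)
        return 0;

    /* Before the block: AREG0 (%r14) -> %rdi, where the block expects it. */
    memcpy(p, MOV_R14_TO_RDI, sizeof MOV_R14_TO_RDI);
    p += sizeof MOV_R14_TO_RDI;

    /* The block itself, copied verbatim. */
    memcpy(p, block, block_len);
    p += block_len;

    /* After the block: %rdi -> %r14, so qemu sees AREG0 as it expects. */
    memcpy(p, MOV_RDI_TO_R14, sizeof MOV_RDI_TO_R14);
    p += sizeof MOV_RDI_TO_R14;

    return (size_t)(p - dst);
}
```

That is six bytes of overhead per block, which is what a custom calling convention pinning the parameter to %r14 would remove.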