[LLVMdev] Pointer vs. integer: backend troubles

Wed May 6 09:18:03 PDT 2009

On May 6, 2009, at 1:58 AM, Christoph Erhardt wrote:

> Hi everyone,
>
> I am currently working on a backend for the TriCore architecture.
> Unfortunately, I have hit an issue with LLVM's internal representation
> that's giving me a bit of a headache.
>
> The problem is that LLVM assumes that a pointer is equivalent to a
> machine-word sized integer. This implies that all pointer arithmetic
> takes place in the CPU's general-purpose registers and is done with the
> "regular" integer instructions.
> Unfortunately, this does not hold true for the TriCore architecture,
> which strictly differentiates between "normal" integer values and
> pointer values. The register set is split into two subsets: 16
> general-purpose registers %d0..%d15 for 32-bit integers and floats, and
> 16 address registers %a0..%a15 for 32-bit pointers, with separate
> instructions. Moreover, the ABI requires that pointer arguments to (and
> pointer results from) functions be passed in address registers instead
> of general-purpose registers.
>
> As LLVM internally converts all pointers to integers (in my case i32),
> there is no way for a backend to tell whether an i32 operand is really
> an integer or actually a pointer. Thus neither the instruction selection
> nor the CallingConvention stuff works for me as expected.

Your architecture poses some significant challenges for LLVM. Tackling
them sounds possible, though it'll take some work.

I'm working on a patch which changes the way function arguments and
return values are lowered:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-April/021908.html
(To everyone who gave me feedback, thanks! I'm working on an updated
patch.)

The current patch doesn't solve your problem immediately, but one of the
suggestions I got was that the argument and return value records should
carry information about what the original type of the value was. This is
needed for example on targets where i64 is not a legal type, but it still
needs to be passed differently from two i32 values. I was originally
thinking of just including the original MVT type, but it could be changed
to the LLVM IR Type*, to provide even more information. I hope to find
time to post an updated version of this patch soon, though I don't
know if it'll go into LLVM 2.6 or if it'll wait for 2.7.
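
To make the idea of records that carry the original type concrete, here
is a minimal standalone C++ sketch. It does not use any LLVM API:
ArgRecord, OrigKind, RegBank, ArgLoc and assignArgs are hypothetical
names, and the starting register index is an assumption picked for
illustration, not something taken from the TriCore ABI. It only shows how
a calling-convention routine could choose between an address register and
a data register once the record says whether an i32 was originally a
pointer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are LLVM APIs.
enum class OrigKind { Integer, Float, Pointer };
enum class RegBank { Data, Address };

// An argument record that remembers what the value was in the IR,
// in addition to the width it was lowered to.
struct ArgRecord {
  unsigned LoweredBits; // e.g. 32 for a value lowered to i32
  OrigKind Orig;        // kind of the value before lowering
};

// Where one argument ends up: a bank plus a register index.
struct ArgLoc {
  RegBank Bank;
  unsigned Index;
  std::string name() const {
    return std::string(Bank == RegBank::Address ? "%a" : "%d") +
           std::to_string(Index);
  }
};

// Give each argument the next free register of the bank selected by its
// original kind. The starting index 4 is an assumption made for this
// sketch, and running out of registers (stack passing) is not handled.
std::vector<ArgLoc> assignArgs(const std::vector<ArgRecord> &Args) {
  unsigned NextData = 4, NextAddr = 4;
  std::vector<ArgLoc> Locs;
  for (const ArgRecord &A : Args) {
    assert(A.LoweredBits == 32 && "sketch only handles 32-bit values");
    if (A.Orig == OrigKind::Pointer)
      Locs.push_back({RegBank::Address, NextAddr++});
    else
      Locs.push_back({RegBank::Data, NextData++});
  }
  return Locs;
}
```

For the argument list (integer, pointer, integer), this sketch yields
%d4, %a4, %d5. Without the Orig field all three would look like plain
i32 values and land in data registers.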

Beyond the ABI requirements, LLVM treats pointers and integers fairly
interchangeably in the optimizer as well as codegen. This isn't
specific to LLVM either; there are a lot of cases where integer
arithmetic is used to perform an index calculation, so the decision of
which instructions to use depends on the context. I've seen other
compilers make these decisions around register allocation time with
a fair amount of success. This is an area which LLVM hasn't
explored much, though I know there are a few people on this list
who are working on targets with similar requirements.
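
As a rough illustration of making that decision from context, here is a
standalone C++ sketch of one possible heuristic. It is not how LLVM or
any particular compiler does it, and Op, Bank, Inst and chooseBanks are
hypothetical names. The assumption is a tiny SSA-like list in which each
instruction refers to earlier ones by index. A value used as a load
address is placed in the address bank, and an add that produces such a
value pulls its left operand (treated as the base) into the address bank
too, while the index arithmetic stays in data registers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical mini-IR for illustration; not LLVM data structures.
enum class Op { Const, Add, Mul, Load };
enum class Bank { Data, Address };

struct Inst {
  Op Kind;
  int Lhs; // index of an earlier instruction, or -1 if unused
  int Rhs; // index of an earlier instruction, or -1 if unused
};

// Walk the instructions backwards so that every user of a value is seen
// before the value itself. A value that any user needs as an address is
// put in the address bank; a real compiler would also have to insert
// copies when a value is needed in both banks, which this sketch skips.
std::vector<Bank> chooseBanks(const std::vector<Inst> &Insts) {
  std::vector<Bank> Banks(Insts.size(), Bank::Data);
  for (int I = static_cast<int>(Insts.size()) - 1; I >= 0; --I) {
    const Inst &In = Insts[I];
    assert(In.Lhs < I && In.Rhs < I && "operands must be defined earlier");
    if (In.Kind == Op::Load && In.Lhs >= 0) {
      // The address operand of a load must live in an address register.
      Banks[In.Lhs] = Bank::Address;
    } else if (In.Kind == Op::Add && Banks[I] == Bank::Address &&
               In.Lhs >= 0) {
      // An address-producing add keeps its base in the address bank;
      // the right operand (the offset) stays wherever it already is.
      Banks[In.Lhs] = Bank::Address;
    }
  }
  return Banks;
}
```

For a sequence like base, i, 4, i*4, base + i*4, load, the sketch puts
the base and the sum in the address bank and leaves i, 4 and the multiply
in the data bank.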

>
>
> It does not seem possible to solve this problem without modifying at
> least some of the original LLVM source code. So what would be the
> easiest (and least invasive) way to achieve this?

FWIW, everyone I know who works on a backend and cares about the quality
of the generated code ends up needing to do work in target-independent code.

Dan