[LLVMdev] First-class structs

Wed Dec 17 05:47:33 PST 2008

Hi Jon,
    1. This depends on what system you're on and what ABI you're using.
For example, on 32 bit x86 linux, a struct return is converted into a
void return and a 'hidden' first argument is added to the argument list
which is a pointer to the returned struct. On 64 bit x86 linux, the same
is true UNLESS the struct is 128 bits or less, in which case it is
generally returned in up to 2 registers, one 'eightbyte' at a time. So
the prerequisite knowledge is: what system are you targeting?
    2. At least in the AMD64 ABI they are not treated differently. In
general you can treat complex as a struct of two elements, but this may
not be in accord with the ABI on all systems. (I'm only really familiar
with AMD64).
    3. I haven't played with this in LLVM 2.4 yet, which introduced
structs as first class types and should make things a lot easier to work
with. However, for struct and complex returns, at least for AMD64, you
do have to do additional work generating IL to get compliance. Dale's
suggestion that you look at what llvm-gcc (use -emit-llvm option) does
for your target is basically what I did to get a grip on how to get our
compiler using LLVM to follow the AMD64 ABI.
    The number of fields in your struct may or may not be what's
important in your ABI. For example, the struct:

    struct foo {
        int32 a;
        int32 b;
    };

    is passed or returned through only ONE GPR on x86 under the AMD64
ABI while the struct

    struct foo2 {
        float a;
        int32 b;
    };

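    is also passed or returned through only ONE GPR, not an XMM
register, because under AMD64 an eightbyte that mixes a float with an
integer is classified as INTEGER (a struct of two floats would go
through ONE XMM register). Struct return by value is somewhat of a pain
and some ABIs make it more painful than others.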
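
    As a rough illustration of the size rule (a simplified sketch in C,
not the full AMD64 classification algorithm): the helper below only
looks at the size of the struct and says how many eightbytes it
occupies, or reports that it is over 128 bits and so goes through
memory. The names eightbytes_or_memory, check_examples and struct big
are made up for this example, int32_t stands in for int32, and which
register class each eightbyte ends up in is deliberately not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct foo  { int32_t a; int32_t b; };              /* 8 bytes  */
struct foo2 { float   a; int32_t b; };              /* 8 bytes  */
struct big  { int64_t a; int64_t b; int64_t c; };   /* 24 bytes */

/* Hypothetical helper: number of eightbytes a by-value struct occupies,
 * or 0 if it is larger than 128 bits (16 bytes) and is therefore
 * handled through memory. Assumption: only the size is considered;
 * the INTEGER/SSE class of each eightbyte is not computed here. */
static int eightbytes_or_memory(size_t size_in_bytes)
{
    if (size_in_bytes > 16)
        return 0;                        /* too big: goes via memory */
    return (int)((size_in_bytes + 7) / 8);   /* round up to eightbytes */
}

/* Self-check against the structs discussed in this mail. */
static void check_examples(void)
{
    assert(eightbytes_or_memory(sizeof(struct foo))  == 1);
    assert(eightbytes_or_memory(sizeof(struct foo2)) == 1);
    assert(eightbytes_or_memory(sizeof(struct big))  == 0);
}
```

    Both two-field structs fit in a single eightbyte, which is why the
field count alone tells you little; the three-field struct is simply
too large for registers.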
   
    4. This depends how your ABI says to do struct return by value. For
AMD64, once the top-level struct, including all its contained structs,
is > 128 bits, it's just returned through a pointer. So in that case
there shouldn't be any performance hit for the complexity of nesting
you're using. For ia32, it's always returned through a pointer, so again
no penalty for struct complexity (in terms of passing/returning from
functions that is). So
unless you're passing structs in and out of tiny leaf functions that get
called millions of times, I wouldn't worry about the overhead too much.
It's the price of compatibility. For interfacing to LLVM 2.3, I had to
generate a bunch of extra LLVM IL to get AMD64 compliance, but by the
time it came out the back of LLVM it was actually remarkably clean
assembly.
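
    To make the 'hidden' pointer return from point 1 concrete, here is
a source-level sketch in C of the two shapes involved. It is only an
analogy: make_big, make_big_lowered and struct big are hypothetical
names, and a real ABI may add details this ignores (for instance, also
handing the pointer back in a register).

```c
#include <stdint.h>

struct big { int64_t a; int64_t b; int64_t c; };   /* 192 bits */

/* What the front end sees: a function returning a struct by value. */
struct big make_big(int64_t x)
{
    struct big r = { x, x + 1, x + 2 };
    return r;
}

/* Roughly what the lowering amounts to for a struct over 128 bits:
 * a void return, with the caller passing a pointer to the space
 * where the result should be written as an extra first argument. */
void make_big_lowered(struct big *result, int64_t x)
{
    result->a = x;
    result->b = x + 1;
    result->c = x + 2;
}
```

    The caller of the second form allocates a struct big on its own
stack and passes its address, so the cost is the same pointer-sized
argument no matter how deeply the struct is nested.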

    Hope this helps somewhat.

    -Tony


Jon Harrop wrote:
> Apologies for the dumb questions but I'm rustier than I had hoped on this.
>
> I'm trying to write a mini ML implementation and am considering trying to 
> optimize tuples into structs to avoid heap allocation when possible. Tuples 
> are often used to return multiple values in ML so I am likely to wind up 
> returning structs from functions.
>
> I also want to support as much of a C-like representation of the internal data 
> structures as possible in order to ease interoperability. This raises several 
> questions:
>
> 1. What is a function returning a struct compiled to (e.g. by GCC on Linux)?
>
> 2. What caveats are there (e.g. is complex in C99 handled differently?)?
>
> 3. If I just throw IL at LLVM naively, when is it likely to emit code that is 
> incompatible with GCC-compiled C code or barf entirely in this context (e.g. 
> are >2 fields in a returned struct on x86 not yet implemented)?
>
> 4. Will run-time performance be degraded if I make heavy use of nested structs 
> and/or return them from functions?
>
> Many thanks,
>