[LLVMdev] Logic for representation of struct parameters and results

Tue Jan 28 13:41:27 PST 2014

Hi,

I know that frontends are generally responsible for lowering struct parameters and results to primitive types to ensure that they are passed/returned in the correct registers. However, I can't find any logic in the following:

***
typedef struct _tinystruct {
  short a;
  char c;
} tinystruct;

tinystruct tinyfunc(tinystruct s)
{
  return s;
}
***

This is translated by clang 3.3 for darwin/x86-64 into

***
%struct._tinystruct = type { i16, i8 }

; Function Attrs: nounwind ssp
define i32 @tinyfunc(%struct._tinystruct* byval align 4 %s) #0 {
entry:
  %retval = alloca %struct._tinystruct, align 2
  %0 = bitcast %struct._tinystruct* %retval to i8*
  %1 = bitcast %struct._tinystruct* %s to i8*
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %0, i8* %1, i32 4, i32 2, i1 false)
  %2 = bitcast %struct._tinystruct* %retval to i32*
  %3 = load i32* %2, align 1
  ret i32 %3
}
***

Why is the parameter passed "byval" while the return value is turned into an i32? Would it be wrong if our compiler also lowered the parameter into an i32? If not, could it nevertheless cause problems when code compiled by our compiler and code compiled by clang are mixed via LTO? (Suppose tinyfunc is compiled with clang and we import it with "declare i32 @tinyfunc(i32)".)
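
To make the i32 part concrete: with a 2-byte short, the struct is padded to 4 bytes, and the memcpy followed by "load i32" in the IR above simply reinterprets those 4 bytes as an integer. Below is a minimal C sketch of that reinterpretation. It assumes a 2-byte short, and the helper names tiny_to_u32 and tiny_from_u32 are hypothetical (they are not something clang emits).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct _tinystruct {
  short a;
  char c;
} tinystruct;

/* Hypothetical helper: copy the struct's bytes into a 32-bit integer,
   mirroring the memcpy + "load i32" sequence in the IR.
   Assumption: short is 2 bytes, so the struct is padded to 4 bytes. */
static uint32_t tiny_to_u32(tinystruct s) {
  uint32_t bits = 0;
  assert(sizeof(tinystruct) == sizeof(uint32_t));
  memcpy(&bits, &s, sizeof s);
  return bits;
}

/* Hypothetical inverse: what a caller would do with the returned i32. */
static tinystruct tiny_from_u32(uint32_t bits) {
  tinystruct s;
  assert(sizeof(tinystruct) == sizeof(uint32_t));
  memcpy(&s, &bits, sizeof s);
  return s;
}
```

The value of the padding byte is unspecified, so only the round trip through both helpers is meaningful, not the exact integer value.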

In general, is there any documentation on how exactly frontends should lower aggregates to LLVM IR for the various architectures/ABIs/calling conventions? If not, are there at least any hints regarding when it is safe to assume that "byval" will do "the right thing"? We already support all calling conventions in our native code generators, so I know which parameters have to be zero/sign-extended, in which registers or where on the stack they should be passed, etc., but it's really not clear to me how this maps onto LLVM IR. Every time I think I've discovered some logic, I find a new exception like the one above. Additionally, looking at clang's output may not even be the best approach, as it still appears to map certain things incorrectly (e.g. with clang 3.3 on darwin/x86-64, a "struct { int a, b, c, d; }" is returned through an sret parameter instead of in rax/rdx).
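
To illustrate the rax/rdx expectation: as far as I understand the System V x86-64 convention, a 16-byte struct made up only of integers occupies two 8-byte units, each of which fits a general-purpose register. A frontend could express this by coercing the struct to a pair of 64-bit integers. The following C sketch shows only that byte-level coercion, assuming a 4-byte int. The names quad, two_eightbytes, quad_to_eightbytes and eightbytes_to_quad are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { int a, b, c, d; } quad;

/* Hypothetical coerced form: two 64-bit integers, one per register. */
typedef struct { uint64_t lo, hi; } two_eightbytes;

/* Assumption: int is 4 bytes, so quad is exactly 16 bytes. */
static two_eightbytes quad_to_eightbytes(quad q) {
  two_eightbytes e;
  assert(sizeof(quad) == sizeof(two_eightbytes));
  memcpy(&e, &q, sizeof q);
  return e;
}

/* Inverse direction: rebuild the struct from the two 64-bit halves. */
static quad eightbytes_to_quad(two_eightbytes e) {
  quad q;
  assert(sizeof(quad) == sizeof(two_eightbytes));
  memcpy(&q, &e, sizeof e);
  return q;
}
```

This says nothing about how a 32-bit target would return the same struct. It only shows why two integer registers are enough to carry it.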

Thanks,

Jonas

PS: I realise some of the above issues may have been fixed in clang/llvm 3.4 and I'm building it right now, but my general question about mapping documentation remains.