[LLVMdev] structs get decomposed when shouldn't

Wed May 2 01:51:06 PDT 2012

On Wednesday 02 May 2012 09:12:16 Duncan Sands wrote:
> > As I can understand, LLVM is trying to decompose datatypes into smaller
> > components in some circumstances.
> 
> Can you please explain more what you are referring to here.  LLVM itself
> shouldn't be changing function parameters or return types unless the
> function has local (internal) linkage (since in that case ABI requirements
> don't matter).

This is in the backend of LLVM itself. When converting the LLVM IR to its DAG 
representation prior to selection, CodeGen asks the target to take care of 
function parameters. Unfortunately the only interface it presents for the 
target code to make that decision is a sequence of MVTs: iN, float, double, 
vNiM, vNfM. Structs are split into their component members with no indication 
that they were originally more than that.
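
To make the loss of information concrete, here is a toy model of that flattening in Python. It is not LLVM code: the tuple encoding of types and the helpers `flatten` and `lower_args` are invented purely for illustration. It only shows why the target cannot recover struct boundaries from a flat list of scalars.

```python
# Toy model only: a type is either a scalar name such as "i32" or a
# ("struct", [members...]) tuple. None of this is LLVM API.

def flatten(ty):
    """Return the flat list of scalar types that one argument becomes."""
    if isinstance(ty, tuple) and ty[0] == "struct":
        out = []
        for member in ty[1]:
            out.extend(flatten(member))
        return out
    return [ty]

def lower_args(arg_types):
    """Flatten every argument and concatenate, discarding struct boundaries."""
    flat = []
    for ty in arg_types:
        flat.extend(flatten(ty))
    return flat

# Two different signatures...
f = [("struct", ["i32", "float"]), "i32"]   # f({i32, float}, i32)
g = ["i32", "float", "i32"]                 # g(i32, float, i32)

# ...that look identical once flattened.
print(lower_args(f))
print(lower_args(g))
```

Both calls print the same list, so any calling-convention rule that depends on "this was a struct" has nothing to key on.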

This has affected a couple more people recently (including me):

http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-March/048203.html
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20120326/055577.html

If this interface could be improved, I believe clang could simply apply a
function to its QualType and produce an LLVM type which does the right thing.
Without that improvement clang will have to use a context-sensitive model to
map the whole sequence of arguments.

At least, that's the ARM situation. I'm not sure Ivan's case can even be solved
without an improved interface (well, he could probably co-opt byval pointers
too, but that's Just Wrong).

This most recent one, I'm not sure about. Whether a struct can be mapped to a 
sane sequence of iN types probably hinges on the various alignment constraints 
and whether an argument can be split between regs and memory. (If a split is
allowed then you can probably use [N x iM] where the struct has size N*M and
alignment M, assuming iM has alignment M; otherwise that would be wrong.)
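
As a rough sketch of that [N x iM] rule, again in Python: the helper `array_type_for` is hypothetical, and working in bytes (so that iM has M = 8 * alignment bits) is my reading of the units rather than something stated above. It also bakes in the assumption that the integer type of that width really has that alignment on the target, which does not hold everywhere.

```python
def array_type_for(size_bytes, align_bytes):
    """Return an LLVM-style array type string for a struct, or None.

    Assumption: the integer type with 8 * align_bytes bits has alignment
    align_bytes on the target. Only meaningful if the ABI allows the
    argument to be split between registers and memory.
    """
    if size_bytes <= 0 or align_bytes <= 0:
        return None
    if size_bytes % align_bytes != 0:
        return None  # size is not a whole number of elements
    n = size_bytes // align_bytes
    return "[%d x i%d]" % (n, 8 * align_bytes)

# A 12-byte struct with 4-byte alignment.
print(array_type_for(12, 4))  # prints [3 x i32]
```

If the ABI forbids splitting an argument between registers and memory, this mapping is not safe and the sketch does not apply.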

And Juhasz David wrote:
> the problem can be mitigated by using a
> pointer tagged with byval attribute and catch such an argument in a
> custom CC function.

That's the approach I've currently adopted for some of my work, but it's
incomplete for my needs and I'm rather concerned about the performance of what 
does work: unless we reimplement mem2reg in the backend too, it introduces 
what amounts to an argument alloca with associated load/store very late on.

Tim.