[llvm] r190636 - Fix PPC ABI for ByVal structs with vector members

Mon Mar 2 09:39:05 PST 2015

David Blaikie <dblaikie at gmail.com> wrote on 28.02.2015 01:45:05:

>> So the way I intended this to work is that clang specifies the
>> expected alignment in the argument save area by the *size* of
>> the array element type (i.e. aggregates that require 8-byte
>> alignment are passed as arrays of i64, and aggregates that
>> require 16-byte alignment are passed as arrays of i128).
>>
>> In LLVM we respect that request by looking at the *size* of the
>> array element type in CalculateStackSlotAlignment:
>>
>>   // Array members are always packed to their original alignment.
>>   if (Flags.isInConsecutiveRegs()) {
>>     // If the array member was split into multiple registers, the first
>>     // needs to be aligned to the size of the full type.  (Except for
>>     // ppcf128, which is only aligned as its f64 components.)
>>     if (Flags.isSplit() && OrigVT != MVT::ppcf128)
>>       Align = OrigVT.getStoreSize();
>>     else
>>       Align = ArgVT.getStoreSize();
>>   }
>>
>> The important bit is the "OrigVT.getStoreSize" here, which will
>> be 16 (bytes) for an array of i128.
>>
>> This may be a bit nonintuitive, but it was the only way I found to
>> pass alignment information from clang to LLVM if I didn't want to
>> use ByVal.
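To make the rule in the quoted fragment concrete, here is a minimal
standalone sketch of it.  It models only the lines quoted above, not the
rest of CalculateStackSlotAlignment.  The ArgInfo struct, the function
name and the 8-byte default are assumptions made for illustration; they
are not the LLVM API.

```cpp
// Hypothetical stand-in for the flags and value types used in the
// quoted fragment; all sizes are in bytes.
struct ArgInfo {
  bool InConsecutiveRegs; // models Flags.isInConsecutiveRegs()
  bool IsSplit;           // models Flags.isSplit()
  bool OrigIsPPCF128;     // models OrigVT == MVT::ppcf128
  unsigned OrigStoreSize; // models OrigVT.getStoreSize(), e.g. 16 for i128
  unsigned ArgStoreSize;  // models ArgVT.getStoreSize(), e.g. 8 for i64
};

// Returns the stack slot alignment in bytes.  DefaultAlign = 8 is an
// assumption (one doubleword on ppc64).
unsigned stackSlotAlignment(const ArgInfo &A, unsigned DefaultAlign = 8) {
  unsigned Align = DefaultAlign;
  // Array members are always packed to their original alignment.
  if (A.InConsecutiveRegs) {
    // A split member is aligned to the size of the full original type,
    // except ppcf128, which is aligned as its f64 components.
    if (A.IsSplit && !A.OrigIsPPCF128)
      Align = A.OrigStoreSize;
    else
      Align = A.ArgStoreSize;
  }
  return Align;
}
```

With these stand-ins, an element of an i128 array that was split into
two i64 pieces gets 16-byte alignment, while an element of an i64 array
gets 8-byte alignment.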

> Why did you want to avoid byval? (not suggesting it's wrong, just
> curious)

According to the ppc64 ABI, the first 64 bytes of the parameter area are
actually passed in registers.  Therefore, it may happen that an aggregate
passed by value actually ends up fully in registers.  Now, if I were to
use "byval", the LLVM back-end would be forced to push those registers
back to memory, because for "byval", the back-end must return a *pointer*
to common code.

This is especially annoying in the ELFv2 ABI, because there may be no
space allocated by the caller at all, and thus the only way to push the
argument to the stack would be to allocate space in the callee's frame.

All in all, this may lead to less efficient code than if we don't use
byval to implement such arguments.  (We still do use byval for arguments
that must be on the stack, e.g. because they are themselves larger than
64 bytes.)
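As a rough illustration of the distinction, the sketch below classifies
an aggregate by where it falls in the parameter area.  It is a
simplified model, not the actual clang or LLVM logic: it only considers
the 64-byte register-backed part of the parameter area, ignores
floating-point and vector registers, and all names in it are
hypothetical.

```cpp
#include <cstdint>

// First 64 bytes of the parameter area are passed in registers.
constexpr std::uint64_t RegAreaBytes = 64;

enum class Placement { Registers, Split, Memory };

// Offset and Size are in bytes, relative to the start of the
// parameter area.
Placement classify(std::uint64_t Offset, std::uint64_t Size) {
  if (Offset >= RegAreaBytes)
    return Placement::Memory;    // starts beyond the register-backed part
  if (Offset + Size <= RegAreaBytes)
    return Placement::Registers; // ends up fully in registers
  return Placement::Split;       // straddles registers and memory
}

// One example condition from the text: an aggregate that is itself
// larger than 64 bytes cannot fit in registers, so byval is still used.
bool exampleNeedsByVal(std::uint64_t Size) {
  return Size > RegAreaBytes;
}
```

For instance, a 32-byte aggregate at offset 0 is classified as
Registers; passing it as byval would force the back-end to store those
registers to memory just to produce a pointer.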

Bye,
Ulrich