[PATCH, PowerPC] ABI fixes / improvements for powerpc64-linux
hfinkel at anl.gov
Wed Jul 9 12:19:56 PDT 2014
----- Original Message -----
> From: "Ulrich Weigand" <Ulrich.Weigand at de.ibm.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "cfe commits" <cfe-commits at cs.uiuc.edu>
> Sent: Wednesday, July 9, 2014 2:26:37 AM
> Subject: Re: [PATCH, PowerPC] ABI fixes / improvements for powerpc64-linux
> Hal Finkel <hfinkel at anl.gov> wrote on 08.07.2014 22:01:10:
> > Why are we restricting to power-of-two element numbers? <3 x float>
> > will become a <4 x float> and end up in vector registers anyhow.
> Hmm ... I guess we are free to make that choice; there is no way to
> create a <3 x float> type with GCC, so LLVM can in fact define the
> ABI for such types as we prefer.
> My code treats those as "non-Altivec" and passes in GPRs. But
> I guess we can instead treat them as <4 x float> Altivec and pass
> in vector registers.
I think this makes sense, because the backend will widen <3 x float> into a <4 x float> anyway.
> This would probably mean that we don't need an isAltivecVectorType
> routine, and just check for getTypeSize == 128.
> > These two hunks are very similar, can they be refactored into a
> > single function?
> If we go for the size == 128 test, the hunks would simplify anyway
> since then there would be no type that needs two registers.
> > Please add some tests for non-power-of-two-number-of-elements
> > types.
> I actually have a <6 x short> test already in. I've now added a
> <3 x short> test as well.
> > > Aggregates are passed using the "byval" mechanism, so in theory
> > > either
> > > Clang or LLVM could compute the correct alignment required for an
> > > aggregate
> > > parameter. However, it seems preferable to do this in Clang,
> > > since
> > > it has
> > > the more complete type information, and Clang needs the
> > > information
> > > anyway
> > > in order to expand va_arg correctly. So this is what the
> > > following
> > > patch
> > > does. As a result, every byval parameter will now carry an
> > > explicit
> > > "align" attribute in the IR created by Clang (which I understand
> > > is
> > > recommended anyway?).
> > Yes.
> > LGTM. Please add some tests for non-power-of-two-number-of-elements
> OK. Added a test for a three-int struct. Also added a test for a
> with alignment 32, to verify it still only gets byval align 16 (see
> Any further tests you'd like to see?
> > + // We may need to skip a slot to ensure alignment.
> > + if (isAlignedParamType(Ty) && (AllocatedSlots & 1))
> > + AllocatedSlots++;
> > How do you know it will be only one? What happens if the type has
> > an
> > enhanced alignment (given by __attribute__((aligned(128))) for
> > example).
> Well, according to the ABI, no parameter is aligned at more than 16
> in the *parameter area*, even if the type has a larger alignment
> That's why there will be at most one slot to skip. (In fact, that's
> one of
> the reasons of that part of the ABI -- we don't want to waste too
> opportunities to put values in registers.)
I don't understand how this works. If the type has a larger alignment requirement, then you can either:
1. Pass a pointer to it, or
2. Make sure it is aligned in the parameter save area as requested.
3. Pass it underaligned, and then memcpy it to an aligned place in the callee's stack frame.
When you have a greater-than-16-byte alignment specified, the caller and/or callee might need to do a dynamic stack realignment to make things work, but that's already implemented. The existing ABI (http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-1.9.html), as far as I can tell says nothing about aggregates being restricted to 16-byte alignment.
> (Note that the prior patch also only supports 8 / 16 bytes of
> in the parameter save area.)
> Attached updated patch set:
> (See attached file: diff-clang-synthvector)
> (See attached file: diff-clang-structalign)
> (See attached file: diff-clang-structpass)
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
More information about the cfe-commits