[PATCH, PowerPC] ABI fixes / improvements for powerpc64-linux
Ulrich.Weigand at de.ibm.com
Thu Jul 10 08:18:45 PDT 2014
Hal Finkel <hfinkel at anl.gov> wrote on 10.07.2014 00:08:57:
> Secondly, realigning the parameter save area *is* LLVM's current
> intended behavior; what should happen is:
> - If a function calls a function with an over-aligned byval
> argument, that should trigger (MFI->getMaxAlignment() > MF.getTarget
> ().getFrameLowering()->getStackAlignment()) to be true in
> - Of course, this probably means that the function already had an
> over-aligned local variable (although this is not certain, it could
> be passing a global or some pointee), and already needed stack
> - The overaligned byval should force the parameter save area to be
> overaligned (by padding in the local variable space)
> - The overaligned byval should be placed at an appropriately
> aligned offset within the parameter save area (PPCISelLowering
> already does this).
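
As a rough illustration of the check described in the quoted list, the following is a minimal sketch, not LLVM code. `FrameModel`, `noteObject` and `needsRealignment` are hypothetical names that only mimic the comparison of the maximum object alignment against the target's stack alignment; the value 16 used below is an assumption for the default stack alignment.

```cpp
#include <algorithm>

// Hypothetical stand-in for the frame bookkeeping; this is NOT LLVM's
// MachineFrameInfo, just a model of the comparison quoted above.
struct FrameModel {
    unsigned maxAlignment = 1;  // largest alignment any stack object asked for

    // Record a stack object: a local variable or an outgoing byval argument.
    void noteObject(unsigned align) {
        maxAlignment = std::max(maxAlignment, align);
    }

    // Realignment is needed once some object wants more alignment than the
    // stack pointer is guaranteed to have.
    bool needsRealignment(unsigned stackAlignment) const {
        return maxAlignment > stackAlignment;
    }
};
```

With an assumed stack alignment of 16, a frame that only holds 8-byte objects needs no realignment, while calling a function with a 128-byte-aligned byval argument flips the check to true.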
OK, thanks for the explanation. I certainly agree that LLVM ought
to continue to support this at the IR level, if nothing else for
the benefit of the JIT or variant-ABI frontends ...
> Seriously, however, the double-copy problem is a real problem. If a
> user puts alignas(128) on some structure/class to keep them all on
> separate cache lines, this is being done for performance. In C++, it
> is perfectly reasonable for these to be put in a container, where
> they might be passed by value to the container manipulation
> functions, for example. Forcing a double-copy, or other performance
> degradation, because of the overalignment would really be quite
> unfortunate. Now, I agree that these will normally be passed by
> const& instead of by value if the structure is actually large, but
> if the size of the structure is small, passing by value is
> reasonable (and should have the desired effect, no two will be on
> the same cache line (either because they're at different aligned
> offsets or because some are in registers)).
I still don't quite see the case for overaligned byval parameters.
Certainly, some use cases want to overalign structures to keep
instances in separate cache lines. This is intended to prevent
cache-line ping-pong when instances are accessed from different
threads. However, that is unlikely to be an issue for by-value
parameters; in fact it is *impossible* for another thread to
access a byval parameter unless its address is taken.
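
To make the distinction concrete, here is a small sketch with a hypothetical `Counter` type (not taken from any real code base). In `total` the address of the parameter is never taken, so its alignment cannot be observed by anyone; in `totalViaPointer` the address is taken, which is the case where the callee needs a properly aligned copy in memory.

```cpp
// Hypothetical example type, over-aligned so that each instance gets its own
// 128-byte cache line when stored in memory.
struct alignas(128) Counter {
    long hits;
    long misses;
};

static_assert(alignof(Counter) == 128, "over-aligned as requested");
static_assert(sizeof(Counter) == 128, "size is rounded up to the alignment");

// Address never taken: no other thread can reach this copy, and the compiler
// is free to keep it wherever it likes, so alignment is unobservable here.
long total(Counter c) {
    return c.hits + c.misses;
}

// Address taken: the pointer must refer to memory that honors alignof(Counter).
long totalViaPointer(Counter c) {
    const Counter* p = &c;
    return p->hits + p->misses;
}
```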
This brings us back to the one case I mentioned earlier, where we
do indeed have to realign-by-copy byval parameters, namely when
the address is taken. In this case, the ABI as defined does
indeed have the drawback of requiring another copy. However,
in defining an ABI you always have to balance pros and cons ...
and there would also be disadvantages of requiring large
alignments of byval parameters at the ABI level; starting with
the fact that this requires large alignment of the stack pointer
(which may not be easy to implement in all compilers), it will
waste stack space if the argument doesn't have its address taken
(reducing stack consumption is often an important goal too, partly
to reduce cache pressure), and it will waste GPRs (unless we break
the 1:1 correspondence between GPRs and the first 8 stack slots --
but that would then make va_list handling more complex).
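
The stack-space and GPR cost can be estimated with a little arithmetic. The sketch below is a simplified model under stated assumptions: slots are 8 bytes, the first 8 slots correspond 1:1 to argument GPRs, and alignment is measured from the start of the parameter save area (a simplification; a real ABI would measure it relative to the stack pointer). `placeByval` and `Placement` are hypothetical names.

```cpp
#include <cstdint>

constexpr std::uint64_t kSlotSize = 8;  // assumed: one doubleword per slot
constexpr std::uint64_t kGprSlots = 8;  // first 8 slots shadow argument GPRs

// Round offset up to a multiple of align (align must be a power of two).
constexpr std::uint64_t alignUp(std::uint64_t offset, std::uint64_t align) {
    return (offset + align - 1) & ~(align - 1);
}

struct Placement {
    std::uint64_t offset;        // where the byval argument starts
    std::uint64_t paddingBytes;  // stack bytes skipped to reach the alignment
    std::uint64_t gprsSkipped;   // GPR-shadowing slots lost to that padding
};

// Place a byval argument with the given alignment at the next free offset
// in the (modeled) parameter save area.
constexpr Placement placeByval(std::uint64_t nextFree, std::uint64_t align) {
    const std::uint64_t offset = alignUp(nextFree, align);
    const std::uint64_t gprBytes = kGprSlots * kSlotSize;
    // Only padding that falls inside the first 8 slots costs registers.
    const std::uint64_t padEnd = offset < gprBytes ? offset : gprBytes;
    const std::uint64_t skipped =
        nextFree < padEnd ? (padEnd - nextFree) / kSlotSize : 0;
    return {offset, offset - nextFree, skipped};
}
```

In this model, a single 8-byte argument followed by a 128-byte-aligned byval leaves 120 bytes of padding and skips all 7 remaining GPR slots, whereas a 16-byte-aligned byval in the same position costs only one slot.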
So even if we still had complete freedom in defining the ABI from
scratch, it's not clear to me that requiring large byval alignment
would be the best choice. As another data point, I'm not aware of
any ABI on other platforms, even recently defined ones, that has
that feature ...