[PATCH] ARM: Homogeneous aggregates must be allocated to contiguous registers

Oliver Stannard oliver.stannard at arm.com
Fri Mar 14 09:36:54 PDT 2014

The Problem

The AAPCS defines a homogeneous aggregate (HA) as an aggregate type containing between one and four members, all of which are of the same machine type.

It also specifies that, for the AAPCS-VFP calling convention, there are situations in which a co-processor register candidate (CPRC) should be back-filled into an unallocated register with a lower number than an already-allocated register.

It further specifies that, for the AAPCS-VFP calling convention, an HA with a base type of float, double, 64-bit vector or 128-bit vector must be allocated in a contiguous block of VFP registers; if that is not possible, the whole HA is allocated on the stack.

However, clang currently converts function arguments with struct types to multiple arguments. This means that this C code:

  struct s { float a; float b; };
  void callee(float a, double b, struct s c);

gets translated to this IR:

  define void @callee(float %a1, double %b2, float %c.0, float %c.1) #0 {

Currently, LLVM will allocate `%a1` to register `s0`, `%b2` to `d1` (which overlaps `s2` and `s3`), `%c.0` to `s1` (back-filling the register), and `%c.1` to `s4`. However, `%c.0` and `%c.1` are parts of the same HA, so they must be allocated in a contiguous block of registers, in this example `s4` and `s5`.
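To make the difference concrete, here is a rough model of the allocation rule in C. It is only an illustration, not code from the patch or from LLVM: `VfpState` and `alloc_block` are hypothetical names, only `s0`-`s15` are modelled, and the stack fallback is reduced to returning -1.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_SREGS 16 /* s0..s15 carry VFP arguments under AAPCS-VFP */

/* Hypothetical model: which single-precision registers are already taken. */
typedef struct {
    bool used[NUM_SREGS];
} VfpState;

/*
 * Allocate `count` elements, each `width` S registers wide (1 for float,
 * 2 for double), as one contiguous block. The search starts from s0 every
 * time, so a later argument can back-fill a hole left by an earlier one.
 * Returns the first S register number, or -1 if the block does not fit.
 */
static int alloc_block(VfpState *st, int count, int width)
{
    assert(count >= 1 && count <= 4 && (width == 1 || width == 2));
    int size = count * width;

    /* Stepping by `width` keeps doubles aligned to an even S register. */
    for (int start = 0; start + size <= NUM_SREGS; start += width) {
        bool is_free = true;
        for (int i = 0; i < size; i++) {
            if (st->used[start + i])
                is_free = false;
        }
        if (!is_free)
            continue;
        for (int i = 0; i < size; i++)
            st->used[start + i] = true;
        return start;
    }
    return -1; /* the caller would place the argument on the stack */
}
```

With a zero-initialised `VfpState`, allocating the float, the double and then the two struct members one at a time (`alloc_block(&st, 1, 1)` for each member) yields `s0`, `s2`, `s1` and `s4`, which is the split allocation described above. Allocating the struct as a single block with `alloc_block(&st, 2, 1)` yields `s4`, so the HA occupies `s4` and `s5` and `s1` stays free.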

There is currently some code in clang which solves some HA-related problems by inserting dummy arguments to use up registers, preventing an HA from being split between registers and the stack. While it may appear that the above problem could also be solved by inserting a padding argument to use up `s1`, consider the following C function signature:

  struct s { float a; float b; };
  void callee(float a, double b, struct s c, float d);

In this case, `d` must be back-filled into `s1`, so we cannot use a padding argument to fill up `s1`.

The Solution

My solution is to move the handling of HAs from clang to the LLVM calling convention code. To do this, I have created a custom allocation function which is used for all members of an HA. It stores members in a list in `CCState`, and when it sees the last member of the HA it allocates the whole lot in one go, trying registers first and then falling back to the stack.
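The queue-then-allocate idea can be sketched in C as follows. This is a simplified stand-in rather than the code in the patch: `CallState`, `claim_run` and `alloc_ha_member` are hypothetical names, only float members are handled, and "on the stack" is represented by a sentinel value instead of a real stack offset.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_SREGS 16
#define MAX_HA_MEMBERS 4
#define ON_STACK (-1)

/* Hypothetical stand-in for the per-call state kept in CCState. */
typedef struct {
    bool used[NUM_SREGS];         /* s0..s15 */
    int *pending[MAX_HA_MEMBERS]; /* where to record each member's location */
    int num_pending;
} CallState;

/* Claim the lowest run of `n` free S registers, or return ON_STACK. */
static int claim_run(CallState *st, int n)
{
    for (int start = 0; start + n <= NUM_SREGS; start++) {
        int len = 0;
        while (len < n && !st->used[start + len])
            len++;
        if (len == n) {
            for (int i = 0; i < n; i++)
                st->used[start + i] = true;
            return start;
        }
    }
    return ON_STACK;
}

/*
 * Called once per float member of an HA. Members are only queued until the
 * last one arrives; the whole aggregate is then placed in one go, in
 * registers if a contiguous run exists and on the stack otherwise.
 */
static void alloc_ha_member(CallState *st, int *loc, bool is_last)
{
    assert(st->num_pending < MAX_HA_MEMBERS);
    st->pending[st->num_pending++] = loc;
    if (!is_last)
        return;

    int first = claim_run(st, st->num_pending);
    for (int i = 0; i < st->num_pending; i++)
        *st->pending[i] = (first == ON_STACK) ? ON_STACK : first + i;
    st->num_pending = 0;
}
```

For the second signature above, with `s0` taken by `a` and `s2`/`s3` taken by `b`, queuing the two members of `c` places them in `s4` and `s5`, and a subsequent `claim_run(&st, 1)` for `d` still returns `s1`. No padding argument is needed because the hole is never consumed.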

There is a related patch to clang which prevents the expansion of a struct-typed argument into its constituent members, which is needed for LLVM to be able to identify an HA. There are comments in clang that say that some optimisations work better with simple types than structs, but I have not done any benchmarks to find out how significant this is. Because of this, I only prevent expansion of struct arguments when the function uses the AAPCS-VFP calling convention.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D3082.1.patch
Type: text/x-patch
Size: 25877 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140314/701bcdd1/attachment.bin>
