<div dir="ltr">Hi Tim,<div><br></div><div>We found that this commit broke the parameter passing rules C.10 - C.11 in AAPCS64.</div><div><br></div><div>Below small case can help you to reproduce the problem.</div><div><br></div><div><div>#include <arm_neon.h></div><div><br></div><div>typedef int64x1_t array19[4];</div><div>typedef union {</div><div><span class="" style="white-space:pre">   </span>int64x1_t a;</div><div><span class="" style="white-space:pre">       </span>array19 b;</div><div>} union20;</div><div><br></div><div>union20 arg1;</div><div>union20 arg2;<br></div><div><br></div><div>  union20 func2(double, union20, union20);</div><div><br></div><div>int main () {</div><div>  union20 result = func2(1.0, arg1, arg2);</div><div>  return 0;</div><div>}</div></div><div><br></div><div><br></div><div>1.0 and arg1 require 5 floating-point registers for parameter passing, which makes arg2 can't fit into the rest of 3 registers. According to the parameter passing rules C.10 - C.11, the whole arg2 need to push into stack, not the part that can't fit into registers.</div><div><br></div><div>Thanks,</div><div>Kevin</div></div><div class="gmail_extra"><br><div class="gmail_quote">2014-11-28 5:02 GMT+08:00 Tim Northover <span dir="ltr"><<a href="mailto:tnorthover@apple.com" target="_blank">tnorthover@apple.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: tnorthover<br>
Date: Thu Nov 27 15:02:42 2014<br>
New Revision: 222903<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=222903&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=222903&view=rev</a><br>
Log:<br>
AArch64: treat [N x Ty] as a block during procedure calls.<br>
<br>
The AAPCS treats small structs and homogeneous floating (or vector) aggregates<br>
specially, and guarantees they either get passed as a contiguous block of<br>
registers, or prevent any future use of those registers and get passed on the<br>
stack.<br>
<br>
This concept can fit quite neatly into LLVM's own type system, mapping an HFA<br>
to [N x float] and so on, and small structs to [N x i64]. Doing so allows<br>
front-ends to emit AAPCS compliant code without having to duplicate the<br>
register counting logic.<br>
<br>
Added:<br>
    llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h<br>
    llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll<br>
Modified:<br>
    llvm/trunk/include/llvm/CodeGen/CallingConvLower.h<br>
    llvm/trunk/include/llvm/IR/DataLayout.h<br>
    llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td<br>
    llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp<br>
    llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp<br>
    llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h<br>
    llvm/trunk/lib/Target/ARM/ARMCallingConv.h<br>
    llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll<br>
<br>
Modified: llvm/trunk/include/llvm/CodeGen/CallingConvLower.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/CallingConvLower.h?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/CallingConvLower.h?rev=222903&r1=222902&r2=222903&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/include/llvm/CodeGen/CallingConvLower.h (original)<br>
+++ llvm/trunk/include/llvm/CodeGen/CallingConvLower.h Thu Nov 27 15:02:42 2014<br>
@@ -345,8 +345,13 @@ public:<br>
   /// AllocateRegBlock - Attempt to allocate a block of RegsRequired consecutive<br>
   /// registers. If this is not possible, return zero. Otherwise, return the first<br>
   /// register of the block that were allocated, marking the entire block as allocated.<br>
-  unsigned AllocateRegBlock(const uint16_t *Regs, unsigned NumRegs, unsigned RegsRequired) {<br>
-    for (unsigned StartIdx = 0; StartIdx <= NumRegs - RegsRequired; ++StartIdx) {<br>
+  unsigned AllocateRegBlock(ArrayRef<const uint16_t> Regs,<br>
+                            unsigned RegsRequired) {<br>
+    if (RegsRequired > Regs.size())<br>
+      return 0;<br>
+<br>
+    for (unsigned StartIdx = 0; StartIdx <= Regs.size() - RegsRequired;<br>
+         ++StartIdx) {<br>
       bool BlockAvailable = true;<br>
       // Check for already-allocated regs in this block<br>
       for (unsigned BlockIdx = 0; BlockIdx < RegsRequired; ++BlockIdx) {<br>
<br>
Modified: llvm/trunk/include/llvm/IR/DataLayout.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/DataLayout.h?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/DataLayout.h?rev=222903&r1=222902&r2=222903&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/include/llvm/IR/DataLayout.h (original)<br>
+++ llvm/trunk/include/llvm/IR/DataLayout.h Thu Nov 27 15:02:42 2014<br>
@@ -228,6 +228,8 @@ public:<br>
     return (StackNaturalAlign != 0) && (Align > StackNaturalAlign);<br>
   }<br>
<br>
+  unsigned getStackAlignment() const { return StackNaturalAlign; }<br>
+<br>
   bool hasMicrosoftFastStdCallMangling() const {<br>
     return ManglingMode == MM_WINCOFF;<br>
   }<br>
<br>
Added: llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h?rev=222903&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h?rev=222903&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h (added)<br>
+++ llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h Thu Nov 27 15:02:42 2014<br>
@@ -0,0 +1,136 @@<br>
+//=== AArch64CallingConv.h - Custom Calling Convention Routines -*- C++ -*-===//<br>
+//<br>
+//                     The LLVM Compiler Infrastructure<br>
+//<br>
+// This file is distributed under the University of Illinois Open Source<br>
+// License. See LICENSE.TXT for details.<br>
+//<br>
+//===----------------------------------------------------------------------===//<br>
+//<br>
+// This file contains the custom routines for the AArch64 Calling Convention<br>
+// that aren't done by tablegen.<br>
+//<br>
+//===----------------------------------------------------------------------===//<br>
+<br>
+#ifndef LLVM_LIB_TARGET_AARCH64_AARCH64CALLINGCONVENTION_H<br>
+#define LLVM_LIB_TARGET_AARCH64_AARCH64CALLINGCONVENTION_H<br>
+<br>
+#include "AArch64.h"<br>
+#include "AArch64InstrInfo.h"<br>
+#include "AArch64Subtarget.h"<br>
+#include "llvm/CodeGen/CallingConvLower.h"<br>
+#include "llvm/IR/CallingConv.h"<br>
+#include "llvm/Target/TargetInstrInfo.h"<br>
+<br>
+namespace {<br>
+using namespace llvm;<br>
+<br>
+static const uint16_t XRegList[] = {AArch64::X0, AArch64::X1, AArch64::X2,<br>
+                                    AArch64::X3, AArch64::X4, AArch64::X5,<br>
+                                    AArch64::X6, AArch64::X7};<br>
+static const uint16_t SRegList[] = {AArch64::S0, AArch64::S1, AArch64::S2,<br>
+                                    AArch64::S3, AArch64::S4, AArch64::S5,<br>
+                                    AArch64::S6, AArch64::S7};<br>
+static const uint16_t DRegList[] = {AArch64::D0, AArch64::D1, AArch64::D2,<br>
+                                    AArch64::D3, AArch64::D4, AArch64::D5,<br>
+                                    AArch64::D6, AArch64::D7};<br>
+static const uint16_t QRegList[] = {AArch64::Q0, AArch64::Q1, AArch64::Q2,<br>
+                                    AArch64::Q3, AArch64::Q4, AArch64::Q5,<br>
+                                    AArch64::Q6, AArch64::Q7};<br>
+<br>
+static bool finishStackBlock(SmallVectorImpl<CCValAssign> &PendingMembers,<br>
+                             MVT LocVT, ISD::ArgFlagsTy &ArgFlags,<br>
+                             CCState &State, unsigned SlotAlign) {<br>
+  unsigned Size = LocVT.getSizeInBits() / 8;<br>
+  unsigned StackAlign = State.getMachineFunction()<br>
+                            .getSubtarget()<br>
+                            .getDataLayout()<br>
+                            ->getStackAlignment();<br>
+  unsigned Align = std::min(ArgFlags.getOrigAlign(), StackAlign);<br>
+<br>
+  for (auto &It : PendingMembers) {<br>
+    It.convertToMem(State.AllocateStack(Size, std::max(Align, SlotAlign)));<br>
+    State.addLoc(It);<br>
+    SlotAlign = 1;<br>
+  }<br>
+<br>
+  // All pending members have now been allocated<br>
+  PendingMembers.clear();<br>
+  return true;<br>
+}<br>
+<br>
+/// The Darwin variadic PCS places anonymous arguments in 8-byte stack slots. An<br>
+/// [N x Ty] type must still be contiguous in memory though.<br>
+static bool CC_AArch64_Custom_Stack_Block(<br>
+      unsigned &ValNo, MVT &ValVT, MVT &LocVT, CCValAssign::LocInfo &LocInfo,<br>
+      ISD::ArgFlagsTy &ArgFlags, CCState &State) {<br>
+  SmallVectorImpl<CCValAssign> &PendingMembers = State.getPendingLocs();<br>
+<br>
+  // Add the argument to the list to be allocated once we know the size of the<br>
+  // block.<br>
+  PendingMembers.push_back(<br>
+      CCValAssign::getPending(ValNo, ValVT, LocVT, LocInfo));<br>
+<br>
+  if (!ArgFlags.isInConsecutiveRegsLast())<br>
+    return true;<br>
+<br>
+  return finishStackBlock(PendingMembers, LocVT, ArgFlags, State, 8);<br>
+}<br>
+<br>
+/// Given an [N x Ty] block, it should be passed in a consecutive sequence of<br>
+/// registers. If no such sequence is available, mark the rest of the registers<br>
+/// of that type as used and place the argument on the stack.<br>
+static bool CC_AArch64_Custom_Block(unsigned &ValNo, MVT &ValVT, MVT &LocVT,<br>
+                                    CCValAssign::LocInfo &LocInfo,<br>
+                                    ISD::ArgFlagsTy &ArgFlags, CCState &State) {<br>
+  // Try to allocate a contiguous block of registers, each of the correct<br>
+  // size to hold one member.<br>
+  ArrayRef<const uint16_t> RegList;<br>
+  if (LocVT.SimpleTy == MVT::i64)<br>
+    RegList = XRegList;<br>
+  else if (LocVT.SimpleTy == MVT::f32)<br>
+    RegList = SRegList;<br>
+  else if (LocVT.SimpleTy == MVT::f64)<br>
+    RegList = DRegList;<br>
+  else if (LocVT.SimpleTy == MVT::v2f64)<br>
+    RegList = QRegList;<br>
+  else {<br>
+    // Not an array we want to split up after all.<br>
+    return false;<br>
+  }<br>
+<br>
+  SmallVectorImpl<CCValAssign> &PendingMembers = State.getPendingLocs();<br>
+<br>
+  // Add the argument to the list to be allocated once we know the size of the<br>
+  // block.<br>
+  PendingMembers.push_back(<br>
+      CCValAssign::getPending(ValNo, ValVT, LocVT, LocInfo));<br>
+<br>
+  if (!ArgFlags.isInConsecutiveRegsLast())<br>
+    return true;<br>
+<br>
+  unsigned RegResult = State.AllocateRegBlock(RegList, PendingMembers.size());<br>
+  if (RegResult) {<br>
+    for (auto &It : PendingMembers) {<br>
+      It.convertToReg(RegResult);<br>
+      State.addLoc(It);<br>
+      ++RegResult;<br>
+    }<br>
+    PendingMembers.clear();<br>
+    return true;<br>
+  }<br>
+<br>
+  // Mark all regs in the class as unavailable<br>
+  for (auto Reg : RegList)<br>
+    State.AllocateReg(Reg);<br>
+<br>
+  const AArch64Subtarget &Subtarget = static_cast<const AArch64Subtarget &>(<br>
+      State.getMachineFunction().getSubtarget());<br>
+  unsigned SlotAlign = Subtarget.isTargetDarwin() ? 1 : 8;<br>
+<br>
+  return finishStackBlock(PendingMembers, LocVT, ArgFlags, State, SlotAlign);<br>
+}<br>
+<br>
+}<br>
+<br>
+#endif<br>
<br>
Modified: llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td?rev=222903&r1=222902&r2=222903&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td (original)<br>
+++ llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td Thu Nov 27 15:02:42 2014<br>
@@ -40,6 +40,8 @@ def CC_AArch64_AAPCS : CallingConv<[<br>
   // slot is 64-bit.<br>
   CCIfByVal<CCPassByVal<8, 8>>,<br>
<br>
+  CCIfConsecutiveRegs<CCCustom<"CC_AArch64_Custom_Block">>,<br>
+<br>
   // Handle i1, i8, i16, i32, i64, f32, f64 and v2f64 by passing in registers,<br>
   // up to eight each of GPR and FPR.<br>
   CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,<br>
@@ -119,6 +121,8 @@ def CC_AArch64_DarwinPCS : CallingConv<[<br>
   // slot is 64-bit.<br>
   CCIfByVal<CCPassByVal<8, 8>>,<br>
<br>
+  CCIfConsecutiveRegs<CCCustom<"CC_AArch64_Custom_Block">>,<br>
+<br>
   // Handle i1, i8, i16, i32, i64, f32, f64 and v2f64 by passing in registers,<br>
   // up to eight each of GPR and FPR.<br>
   CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,<br>
@@ -159,6 +163,8 @@ def CC_AArch64_DarwinPCS_VarArg : Callin<br>
   CCIfType<[v2f32], CCBitConvertToType<v2i32>>,<br>
   CCIfType<[v2f64, v4f32, f128], CCBitConvertToType<v2i64>>,<br>
<br>
+  CCIfConsecutiveRegs<CCCustom<"CC_AArch64_Custom_Stack_Block">>,<br>
+<br>
   // Handle all scalar types as either i64 or f64.<br>
   CCIfType<[i8, i16, i32], CCPromoteToType<i64>>,<br>
   CCIfType<[f16, f32],     CCPromoteToType<f64>>,<br>
<br>
Modified: llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp?rev=222903&r1=222902&r2=222903&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp (original)<br>
+++ llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp Thu Nov 27 15:02:42 2014<br>
@@ -14,6 +14,7 @@<br>
 //===----------------------------------------------------------------------===//<br>
<br>
 #include "AArch64.h"<br>
+#include "AArch64CallingConvention.h"<br>
 #include "AArch64Subtarget.h"<br>
 #include "AArch64TargetMachine.h"<br>
 #include "MCTargetDesc/AArch64AddressingModes.h"<br>
<br>
Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=222903&r1=222902&r2=222903&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp (original)<br>
+++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp Thu Nov 27 15:02:42 2014<br>
@@ -12,6 +12,7 @@<br>
 //===----------------------------------------------------------------------===//<br>
<br>
 #include "AArch64ISelLowering.h"<br>
+#include "AArch64CallingConvention.h"<br>
 #include "AArch64MachineFunctionInfo.h"<br>
 #include "AArch64PerfectShuffle.h"<br>
 #include "AArch64Subtarget.h"<br>
@@ -8842,3 +8843,8 @@ Value *AArch64TargetLowering::emitStoreC<br>
                 Val, Stxr->getFunctionType()->getParamType(0)),<br>
       Addr);<br>
 }<br>
+<br>
+bool AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters(<br>
+    Type *Ty, CallingConv::ID CallConv, bool isVarArg) const {<br>
+  return Ty->isArrayTy();<br>
+}<br>
<br>
Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h?rev=222903&r1=222902&r2=222903&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h (original)<br>
+++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h Thu Nov 27 15:02:42 2014<br>
@@ -473,6 +473,10 @@ private:<br>
<br>
   void ReplaceNodeResults(SDNode *N, SmallVectorImpl<SDValue> &Results,<br>
                           SelectionDAG &DAG) const override;<br>
+<br>
+  bool functionArgumentNeedsConsecutiveRegisters(Type *Ty,<br>
+                                                 CallingConv::ID CallConv,<br>
+                                                 bool isVarArg) const;<br>
 };<br>
<br>
 namespace AArch64 {<br>
<br>
Modified: llvm/trunk/lib/Target/ARM/ARMCallingConv.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMCallingConv.h?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMCallingConv.h?rev=222903&r1=222902&r2=222903&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/ARM/ARMCallingConv.h (original)<br>
+++ llvm/trunk/lib/Target/ARM/ARMCallingConv.h Thu Nov 27 15:02:42 2014<br>
@@ -194,20 +194,16 @@ static bool CC_ARM_AAPCS_Custom_HA(unsig<br>
<br>
     // Try to allocate a contiguous block of registers, each of the correct<br>
     // size to hold one member.<br>
-    const uint16_t *RegList;<br>
-    unsigned NumRegs;<br>
+    ArrayRef<const uint16_t> RegList;<br>
     switch (LocVT.SimpleTy) {<br>
     case MVT::f32:<br>
       RegList = SRegList;<br>
-      NumRegs = 16;<br>
       break;<br>
     case MVT::f64:<br>
       RegList = DRegList;<br>
-      NumRegs = 8;<br>
       break;<br>
     case MVT::v2f64:<br>
       RegList = QRegList;<br>
-      NumRegs = 4;<br>
       break;<br>
     default:<br>
       llvm_unreachable("Unexpected member type for HA");<br>
@@ -215,7 +211,7 @@ static bool CC_ARM_AAPCS_Custom_HA(unsig<br>
     }<br>
<br>
     unsigned RegResult =<br>
-        State.AllocateRegBlock(RegList, NumRegs, PendingHAMembers.size());<br>
+        State.AllocateRegBlock(RegList, PendingHAMembers.size());<br>
<br>
     if (RegResult) {<br>
       for (SmallVectorImpl<CCValAssign>::iterator It = PendingHAMembers.begin();<br>
<br>
Added: llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll?rev=222903&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll?rev=222903&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll (added)<br>
+++ llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll Thu Nov 27 15:02:42 2014<br>
@@ -0,0 +1,92 @@<br>
+; RUN: llc -mtriple=aarch64-apple-ios7.0 -o - %s | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-DARWINPCS<br>
+; RUN: llc -mtriple=aarch64-linux-gnu -o - %s | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-AAPCS<br>
+<br>
+declare void @callee(...)<br>
+<br>
+define float @test_hfa_regs(float, [2 x float] %in) {<br>
+; CHECK-LABEL: test_hfa_regs:<br>
+; CHECK: fadd s0, s1, s2<br>
+<br>
+  %lhs = extractvalue [2 x float] %in, 0<br>
+  %rhs = extractvalue [2 x float] %in, 1<br>
+  %sum = fadd float %lhs, %rhs<br>
+  ret float %sum<br>
+}<br>
+<br>
+; Check that the array gets allocated to a contiguous block on the stack (rather<br>
+; than the default of 2 8-byte slots).<br>
+define float @test_hfa_block([7 x float], [2 x float] %in) {<br>
+; CHECK-LABEL: test_hfa_block:<br>
+; CHECK: ldp [[LHS:s[0-9]+]], [[RHS:s[0-9]+]], [sp]<br>
+; CHECK: fadd s0, [[LHS]], [[RHS]]<br>
+<br>
+  %lhs = extractvalue [2 x float] %in, 0<br>
+  %rhs = extractvalue [2 x float] %in, 1<br>
+  %sum = fadd float %lhs, %rhs<br>
+  ret float %sum<br>
+}<br>
+<br>
+; Check that an HFA prevents backfilling of VFP registers (i.e. %rhs must go on<br>
+; the stack rather than in s7).<br>
+define float @test_hfa_block_consume([7 x float], [2 x float] %in, float %rhs) {<br>
+; CHECK-LABEL: test_hfa_block_consume:<br>
+; CHECK-DAG: ldr [[LHS:s[0-9]+]], [sp]<br>
+; CHECK-DAG: ldr [[RHS:s[0-9]+]], [sp, #8]<br>
+; CHECK: fadd s0, [[LHS]], [[RHS]]<br>
+<br>
+  %lhs = extractvalue [2 x float] %in, 0<br>
+  %sum = fadd float %lhs, %rhs<br>
+  ret float %sum<br>
+}<br>
+<br>
+define float @test_hfa_stackalign([8 x float], [1 x float], [2 x float] %in) {<br>
+; CHECK-LABEL: test_hfa_stackalign:<br>
+; CHECK-AAPCS: ldp [[LHS:s[0-9]+]], [[RHS:s[0-9]+]], [sp, #8]<br>
+; CHECK-DARWINPCS: ldp [[LHS:s[0-9]+]], [[RHS:s[0-9]+]], [sp, #4]<br>
+; CHECK: fadd s0, [[LHS]], [[RHS]]<br>
+  %lhs = extractvalue [2 x float] %in, 0<br>
+  %rhs = extractvalue [2 x float] %in, 1<br>
+  %sum = fadd float %lhs, %rhs<br>
+  ret float %sum<br>
+}<br>
+<br>
+; An HFA that ends up on the stack should not have any effect on where<br>
+; integer-based arguments go.<br>
+define i64 @test_hfa_ignores_gprs([7 x float], [2 x float] %in, i64, i64 %res) {<br>
+; CHECK-LABEL: test_hfa_ignores_gprs:<br>
+; CHECK: mov x0, x1<br>
+  ret i64 %res<br>
+}<br>
+<br>
+; [2 x float] should not be promoted to double by the Darwin varargs handling,<br>
+; but should go in an 8-byte aligned slot.<br>
+define void @test_varargs_stackalign() {<br>
+; CHECK-LABEL: test_varargs_stackalign:<br>
+; CHECK-DARWINPCS: stp {{w[0-9]+}}, {{w[0-9]+}}, [sp, #16]<br>
+<br>
+  call void(...)* @callee([3 x float] undef, [2 x float] [float 1.0, float 2.0])<br>
+  ret void<br>
+}<br>
+<br>
+define i64 @test_smallstruct_block([7 x i64], [2 x i64] %in) {<br>
+; CHECK-LABEL: test_smallstruct_block:<br>
+; CHECK: ldp [[LHS:x[0-9]+]], [[RHS:x[0-9]+]], [sp]<br>
+; CHECK: add x0, [[LHS]], [[RHS]]<br>
+  %lhs = extractvalue [2 x i64] %in, 0<br>
+  %rhs = extractvalue [2 x i64] %in, 1<br>
+  %sum = add i64 %lhs, %rhs<br>
+  ret i64 %sum<br>
+}<br>
+<br>
+; Check that a small struct prevents backfilling of registers (i.e. %rhs<br>
+; must go on the stack rather than in x7).<br>
+define i64 @test_smallstruct_block_consume([7 x i64], [2 x i64] %in, i64 %rhs) {<br>
+; CHECK-LABEL: test_smallstruct_block_consume:<br>
+; CHECK-DAG: ldr [[LHS:x[0-9]+]], [sp]<br>
+; CHECK-DAG: ldr [[RHS:x[0-9]+]], [sp, #16]<br>
+; CHECK: add x0, [[LHS]], [[RHS]]<br>
+<br>
+  %lhs = extractvalue [2 x i64] %in, 0<br>
+  %sum = add i64 %lhs, %rhs<br>
+  ret i64 %sum<br>
+}<br>
<br>
Modified: llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll?rev=222903&r1=222902&r2=222903&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll (original)<br>
+++ llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll Thu Nov 27 15:02:42 2014<br>
@@ -96,7 +96,7 @@ define void @test_nospare([8 x i64], [8<br>
<br>
 ; If there are non-variadic arguments on the stack (here two i64s) then the<br>
 ; __stack field should point just past them.<br>
-define void @test_offsetstack([10 x i64], [3 x float], ...) {<br>
+define void @test_offsetstack([8 x i64], [2 x i64], [3 x float], ...) {<br>
 ; CHECK-LABEL: test_offsetstack:<br>
 ; CHECK: sub sp, sp, #80<br>
 ; CHECK: add [[STACK_TOP:x[0-9]+]], sp, #96<br>
<br>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">Best Regards,<div><br></div><div>Kevin Qin</div></div></div>
</div>