<div dir="ltr">Hi Tim,<div><br></div><div>We found that this commit broke the parameter passing rules C.10 - C.11 in AAPCS64.</div><div><br></div><div>Below small case can help you to reproduce the problem.</div><div><br></div><div><div>#include <arm_neon.h></div><div><br></div><div>typedef int64x1_t array19[4];</div><div>typedef union {</div><div><span class="" style="white-space:pre">   </span>int64x1_t a;</div><div><span class="" style="white-space:pre">       </span>array19 b;</div><div>} union20;</div><div><br></div><div>union20 arg1;</div><div>union20 arg2;<br></div><div><br></div><div>  union20 func2(double, union20, union20);</div><div><br></div><div>int main () {</div><div>  union20 result = func2(1.0, arg1, arg2);</div><div>  return 0;</div><div>}</div></div><div><br></div><div><br></div><div>1.0 and arg1 require 5 floating-point registers for parameter passing, which makes arg2 can't fit into the rest of 3 registers. According to the parameter passing rules C.10 - C.11, the whole arg2 need to push into stack, not the part that can't fit into registers.</div><div><br></div><div>Thanks,</div><div>Kevin</div></div><div class="gmail_extra"><br><div class="gmail_quote">2014-11-28 5:02 GMT+08:00 Tim Northover <span dir="ltr"><<a href="mailto:tnorthover@apple.com" target="_blank">tnorthover@apple.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: tnorthover<br>

Date: Thu Nov 27 15:02:42 2014<br>

New Revision: 222903<br>

<br>

URL: <a href="http://llvm.org/viewvc/llvm-project?rev=222903&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=222903&view=rev</a><br>

Log:<br>

AArch64: treat [N x Ty] as a block during procedure calls.<br>

<br>

The AAPCS treats small structs and homogeneous floating (or vector) aggregates<br>

specially, and guarantees they either get passed as a contiguous block of<br>

registers, or prevent any future use of those registers and get passed on the<br>

stack.<br>

<br>

This concept can fit quite neatly into LLVM's own type system, mapping an HFA<br>

to [N x float] and so on, and small structs to [N x i64]. Doing so allows<br>

front-ends to emit AAPCS compliant code without having to duplicate the<br>

register counting logic.<br>

<br>

Added:<br>

    llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h<br>

    llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll<br>

Modified:<br>

    llvm/trunk/include/llvm/CodeGen/CallingConvLower.h<br>

    llvm/trunk/include/llvm/IR/DataLayout.h<br>

    llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td<br>

    llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp<br>

    llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp<br>

    llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h<br>

    llvm/trunk/lib/Target/ARM/ARMCallingConv.h<br>

    llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll<br>

<br>

Modified: llvm/trunk/include/llvm/CodeGen/CallingConvLower.h<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/CallingConvLower.h?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/CallingConvLower.h?rev=222903&r1=222902&r2=222903&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/include/llvm/CodeGen/CallingConvLower.h (original)<br>

+++ llvm/trunk/include/llvm/CodeGen/CallingConvLower.h Thu Nov 27 15:02:42 2014<br>

@@ -345,8 +345,13 @@ public:<br>

   /// AllocateRegBlock - Attempt to allocate a block of RegsRequired consecutive<br>

   /// registers. If this is not possible, return zero. Otherwise, return the first<br>

   /// register of the block that were allocated, marking the entire block as allocated.<br>

-  unsigned AllocateRegBlock(const uint16_t *Regs, unsigned NumRegs, unsigned RegsRequired) {<br>

-    for (unsigned StartIdx = 0; StartIdx <= NumRegs - RegsRequired; ++StartIdx) {<br>

+  unsigned AllocateRegBlock(ArrayRef<const uint16_t> Regs,<br>

+                            unsigned RegsRequired) {<br>

+    if (RegsRequired > Regs.size())<br>

+      return 0;<br>

+<br>

+    for (unsigned StartIdx = 0; StartIdx <= Regs.size() - RegsRequired;<br>

+         ++StartIdx) {<br>

       bool BlockAvailable = true;<br>

       // Check for already-allocated regs in this block<br>

       for (unsigned BlockIdx = 0; BlockIdx < RegsRequired; ++BlockIdx) {<br>

<br>

Modified: llvm/trunk/include/llvm/IR/DataLayout.h<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/DataLayout.h?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/DataLayout.h?rev=222903&r1=222902&r2=222903&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/include/llvm/IR/DataLayout.h (original)<br>

+++ llvm/trunk/include/llvm/IR/DataLayout.h Thu Nov 27 15:02:42 2014<br>

@@ -228,6 +228,8 @@ public:<br>

     return (StackNaturalAlign != 0) && (Align > StackNaturalAlign);<br>

   }<br>

<br>

+  unsigned getStackAlignment() const { return StackNaturalAlign; }<br>

+<br>

   bool hasMicrosoftFastStdCallMangling() const {<br>

     return ManglingMode == MM_WINCOFF;<br>

   }<br>

<br>

Added: llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h?rev=222903&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h?rev=222903&view=auto</a><br>

==============================================================================<br>

--- llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h (added)<br>

+++ llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.h Thu Nov 27 15:02:42 2014<br>

@@ -0,0 +1,136 @@<br>

+//=== AArch64CallingConv.h - Custom Calling Convention Routines -*- C++ -*-===//<br>

+//<br>

+//                     The LLVM Compiler Infrastructure<br>

+//<br>

+// This file is distributed under the University of Illinois Open Source<br>

+// License. See LICENSE.TXT for details.<br>

+//<br>

+//===----------------------------------------------------------------------===//<br>

+//<br>

+// This file contains the custom routines for the AArch64 Calling Convention<br>

+// that aren't done by tablegen.<br>

+//<br>

+//===----------------------------------------------------------------------===//<br>

+<br>

+#ifndef LLVM_LIB_TARGET_AARCH64_AARCH64CALLINGCONVENTION_H<br>

+#define LLVM_LIB_TARGET_AARCH64_AARCH64CALLINGCONVENTION_H<br>

+<br>

+#include "AArch64.h"<br>

+#include "AArch64InstrInfo.h"<br>

+#include "AArch64Subtarget.h"<br>

+#include "llvm/CodeGen/CallingConvLower.h"<br>

+#include "llvm/IR/CallingConv.h"<br>

+#include "llvm/Target/TargetInstrInfo.h"<br>

+<br>

+namespace {<br>

+using namespace llvm;<br>

+<br>

+static const uint16_t XRegList[] = {AArch64::X0, AArch64::X1, AArch64::X2,<br>

+                                    AArch64::X3, AArch64::X4, AArch64::X5,<br>

+                                    AArch64::X6, AArch64::X7};<br>

+static const uint16_t SRegList[] = {AArch64::S0, AArch64::S1, AArch64::S2,<br>

+                                    AArch64::S3, AArch64::S4, AArch64::S5,<br>

+                                    AArch64::S6, AArch64::S7};<br>

+static const uint16_t DRegList[] = {AArch64::D0, AArch64::D1, AArch64::D2,<br>

+                                    AArch64::D3, AArch64::D4, AArch64::D5,<br>

+                                    AArch64::D6, AArch64::D7};<br>

+static const uint16_t QRegList[] = {AArch64::Q0, AArch64::Q1, AArch64::Q2,<br>

+                                    AArch64::Q3, AArch64::Q4, AArch64::Q5,<br>

+                                    AArch64::Q6, AArch64::Q7};<br>

+<br>

+static bool finishStackBlock(SmallVectorImpl<CCValAssign> &PendingMembers,<br>

+                             MVT LocVT, ISD::ArgFlagsTy &ArgFlags,<br>

+                             CCState &State, unsigned SlotAlign) {<br>

+  unsigned Size = LocVT.getSizeInBits() / 8;<br>

+  unsigned StackAlign = State.getMachineFunction()<br>

+                            .getSubtarget()<br>

+                            .getDataLayout()<br>

+                            ->getStackAlignment();<br>

+  unsigned Align = std::min(ArgFlags.getOrigAlign(), StackAlign);<br>

+<br>

+  for (auto &It : PendingMembers) {<br>

+    It.convertToMem(State.AllocateStack(Size, std::max(Align, SlotAlign)));<br>

+    State.addLoc(It);<br>

+    SlotAlign = 1;<br>

+  }<br>

+<br>

+  // All pending members have now been allocated<br>

+  PendingMembers.clear();<br>

+  return true;<br>

+}<br>

+<br>

+/// The Darwin variadic PCS places anonymous arguments in 8-byte stack slots. An<br>

+/// [N x Ty] type must still be contiguous in memory though.<br>

+static bool CC_AArch64_Custom_Stack_Block(<br>

+      unsigned &ValNo, MVT &ValVT, MVT &LocVT, CCValAssign::LocInfo &LocInfo,<br>

+      ISD::ArgFlagsTy &ArgFlags, CCState &State) {<br>

+  SmallVectorImpl<CCValAssign> &PendingMembers = State.getPendingLocs();<br>

+<br>

+  // Add the argument to the list to be allocated once we know the size of the<br>

+  // block.<br>

+  PendingMembers.push_back(<br>

+      CCValAssign::getPending(ValNo, ValVT, LocVT, LocInfo));<br>

+<br>

+  if (!ArgFlags.isInConsecutiveRegsLast())<br>

+    return true;<br>

+<br>

+  return finishStackBlock(PendingMembers, LocVT, ArgFlags, State, 8);<br>

+}<br>

+<br>

+/// Given an [N x Ty] block, it should be passed in a consecutive sequence of<br>

+/// registers. If no such sequence is available, mark the rest of the registers<br>

+/// of that type as used and place the argument on the stack.<br>

+static bool CC_AArch64_Custom_Block(unsigned &ValNo, MVT &ValVT, MVT &LocVT,<br>

+                                    CCValAssign::LocInfo &LocInfo,<br>

+                                    ISD::ArgFlagsTy &ArgFlags, CCState &State) {<br>

+  // Try to allocate a contiguous block of registers, each of the correct<br>

+  // size to hold one member.<br>

+  ArrayRef<const uint16_t> RegList;<br>

+  if (LocVT.SimpleTy == MVT::i64)<br>

+    RegList = XRegList;<br>

+  else if (LocVT.SimpleTy == MVT::f32)<br>

+    RegList = SRegList;<br>

+  else if (LocVT.SimpleTy == MVT::f64)<br>

+    RegList = DRegList;<br>

+  else if (LocVT.SimpleTy == MVT::v2f64)<br>

+    RegList = QRegList;<br>

+  else {<br>

+    // Not an array we want to split up after all.<br>

+    return false;<br>

+  }<br>

+<br>

+  SmallVectorImpl<CCValAssign> &PendingMembers = State.getPendingLocs();<br>

+<br>

+  // Add the argument to the list to be allocated once we know the size of the<br>

+  // block.<br>

+  PendingMembers.push_back(<br>

+      CCValAssign::getPending(ValNo, ValVT, LocVT, LocInfo));<br>

+<br>

+  if (!ArgFlags.isInConsecutiveRegsLast())<br>

+    return true;<br>

+<br>

+  unsigned RegResult = State.AllocateRegBlock(RegList, PendingMembers.size());<br>

+  if (RegResult) {<br>

+    for (auto &It : PendingMembers) {<br>

+      It.convertToReg(RegResult);<br>

+      State.addLoc(It);<br>

+      ++RegResult;<br>

+    }<br>

+    PendingMembers.clear();<br>

+    return true;<br>

+  }<br>

+<br>

+  // Mark all regs in the class as unavailable<br>

+  for (auto Reg : RegList)<br>

+    State.AllocateReg(Reg);<br>

+<br>

+  const AArch64Subtarget &Subtarget = static_cast<const AArch64Subtarget &>(<br>

+      State.getMachineFunction().getSubtarget());<br>

+  unsigned SlotAlign = Subtarget.isTargetDarwin() ? 1 : 8;<br>

+<br>

+  return finishStackBlock(PendingMembers, LocVT, ArgFlags, State, SlotAlign);<br>

+}<br>

+<br>

+}<br>

+<br>

+#endif<br>

<br>

Modified: llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td?rev=222903&r1=222902&r2=222903&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td (original)<br>

+++ llvm/trunk/lib/Target/AArch64/AArch64CallingConvention.td Thu Nov 27 15:02:42 2014<br>

@@ -40,6 +40,8 @@ def CC_AArch64_AAPCS : CallingConv<[<br>

   // slot is 64-bit.<br>

   CCIfByVal<CCPassByVal<8, 8>>,<br>

<br>

+  CCIfConsecutiveRegs<CCCustom<"CC_AArch64_Custom_Block">>,<br>

+<br>

   // Handle i1, i8, i16, i32, i64, f32, f64 and v2f64 by passing in registers,<br>

   // up to eight each of GPR and FPR.<br>

   CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,<br>

@@ -119,6 +121,8 @@ def CC_AArch64_DarwinPCS : CallingConv<[<br>

   // slot is 64-bit.<br>

   CCIfByVal<CCPassByVal<8, 8>>,<br>

<br>

+  CCIfConsecutiveRegs<CCCustom<"CC_AArch64_Custom_Block">>,<br>

+<br>

   // Handle i1, i8, i16, i32, i64, f32, f64 and v2f64 by passing in registers,<br>

   // up to eight each of GPR and FPR.<br>

   CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,<br>

@@ -159,6 +163,8 @@ def CC_AArch64_DarwinPCS_VarArg : Callin<br>

   CCIfType<[v2f32], CCBitConvertToType<v2i32>>,<br>

   CCIfType<[v2f64, v4f32, f128], CCBitConvertToType<v2i64>>,<br>

<br>

+  CCIfConsecutiveRegs<CCCustom<"CC_AArch64_Custom_Stack_Block">>,<br>

+<br>

   // Handle all scalar types as either i64 or f64.<br>

   CCIfType<[i8, i16, i32], CCPromoteToType<i64>>,<br>

   CCIfType<[f16, f32],     CCPromoteToType<f64>>,<br>

<br>

Modified: llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp?rev=222903&r1=222902&r2=222903&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp (original)<br>

+++ llvm/trunk/lib/Target/AArch64/AArch64FastISel.cpp Thu Nov 27 15:02:42 2014<br>

@@ -14,6 +14,7 @@<br>

 //===----------------------------------------------------------------------===//<br>

<br>

 #include "AArch64.h"<br>

+#include "AArch64CallingConvention.h"<br>

 #include "AArch64Subtarget.h"<br>

 #include "AArch64TargetMachine.h"<br>

 #include "MCTargetDesc/AArch64AddressingModes.h"<br>

<br>

Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=222903&r1=222902&r2=222903&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp (original)<br>

+++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp Thu Nov 27 15:02:42 2014<br>

@@ -12,6 +12,7 @@<br>

 //===----------------------------------------------------------------------===//<br>

<br>

 #include "AArch64ISelLowering.h"<br>

+#include "AArch64CallingConvention.h"<br>

 #include "AArch64MachineFunctionInfo.h"<br>

 #include "AArch64PerfectShuffle.h"<br>

 #include "AArch64Subtarget.h"<br>

@@ -8842,3 +8843,8 @@ Value *AArch64TargetLowering::emitStoreC<br>

                 Val, Stxr->getFunctionType()->getParamType(0)),<br>

       Addr);<br>

 }<br>

+<br>

+bool AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters(<br>

+    Type *Ty, CallingConv::ID CallConv, bool isVarArg) const {<br>

+  return Ty->isArrayTy();<br>

+}<br>

<br>

Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h?rev=222903&r1=222902&r2=222903&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h (original)<br>

+++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h Thu Nov 27 15:02:42 2014<br>

@@ -473,6 +473,10 @@ private:<br>

<br>

   void ReplaceNodeResults(SDNode *N, SmallVectorImpl<SDValue> &Results,<br>

                           SelectionDAG &DAG) const override;<br>

+<br>

+  bool functionArgumentNeedsConsecutiveRegisters(Type *Ty,<br>

+                                                 CallingConv::ID CallConv,<br>

+                                                 bool isVarArg) const;<br>

 };<br>

<br>

 namespace AArch64 {<br>

<br>

Modified: llvm/trunk/lib/Target/ARM/ARMCallingConv.h<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMCallingConv.h?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMCallingConv.h?rev=222903&r1=222902&r2=222903&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/lib/Target/ARM/ARMCallingConv.h (original)<br>

+++ llvm/trunk/lib/Target/ARM/ARMCallingConv.h Thu Nov 27 15:02:42 2014<br>

@@ -194,20 +194,16 @@ static bool CC_ARM_AAPCS_Custom_HA(unsig<br>

<br>

     // Try to allocate a contiguous block of registers, each of the correct<br>

     // size to hold one member.<br>

-    const uint16_t *RegList;<br>

-    unsigned NumRegs;<br>

+    ArrayRef<const uint16_t> RegList;<br>

     switch (LocVT.SimpleTy) {<br>

     case MVT::f32:<br>

       RegList = SRegList;<br>

-      NumRegs = 16;<br>

       break;<br>

     case MVT::f64:<br>

       RegList = DRegList;<br>

-      NumRegs = 8;<br>

       break;<br>

     case MVT::v2f64:<br>

       RegList = QRegList;<br>

-      NumRegs = 4;<br>

       break;<br>

     default:<br>

       llvm_unreachable("Unexpected member type for HA");<br>

@@ -215,7 +211,7 @@ static bool CC_ARM_AAPCS_Custom_HA(unsig<br>

     }<br>

<br>

     unsigned RegResult =<br>

-        State.AllocateRegBlock(RegList, NumRegs, PendingHAMembers.size());<br>

+        State.AllocateRegBlock(RegList, PendingHAMembers.size());<br>

<br>

     if (RegResult) {<br>

       for (SmallVectorImpl<CCValAssign>::iterator It = PendingHAMembers.begin();<br>

<br>

Added: llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll?rev=222903&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll?rev=222903&view=auto</a><br>

==============================================================================<br>

--- llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll (added)<br>

+++ llvm/trunk/test/CodeGen/AArch64/argument-blocks.ll Thu Nov 27 15:02:42 2014<br>

@@ -0,0 +1,92 @@<br>

+; RUN: llc -mtriple=aarch64-apple-ios7.0 -o - %s | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-DARWINPCS<br>

+; RUN: llc -mtriple=aarch64-linux-gnu -o - %s | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-AAPCS<br>

+<br>

+declare void @callee(...)<br>

+<br>

+define float @test_hfa_regs(float, [2 x float] %in) {<br>

+; CHECK-LABEL: test_hfa_regs:<br>

+; CHECK: fadd s0, s1, s2<br>

+<br>

+  %lhs = extractvalue [2 x float] %in, 0<br>

+  %rhs = extractvalue [2 x float] %in, 1<br>

+  %sum = fadd float %lhs, %rhs<br>

+  ret float %sum<br>

+}<br>

+<br>

+; Check that the array gets allocated to a contiguous block on the stack (rather<br>

+; than the default of 2 8-byte slots).<br>

+define float @test_hfa_block([7 x float], [2 x float] %in) {<br>

+; CHECK-LABEL: test_hfa_block:<br>

+; CHECK: ldp [[LHS:s[0-9]+]], [[RHS:s[0-9]+]], [sp]<br>

+; CHECK: fadd s0, [[LHS]], [[RHS]]<br>

+<br>

+  %lhs = extractvalue [2 x float] %in, 0<br>

+  %rhs = extractvalue [2 x float] %in, 1<br>

+  %sum = fadd float %lhs, %rhs<br>

+  ret float %sum<br>

+}<br>

+<br>

+; Check that an HFA prevents backfilling of VFP registers (i.e. %rhs must go on<br>

+; the stack rather than in s7).<br>

+define float @test_hfa_block_consume([7 x float], [2 x float] %in, float %rhs) {<br>

+; CHECK-LABEL: test_hfa_block_consume:<br>

+; CHECK-DAG: ldr [[LHS:s[0-9]+]], [sp]<br>

+; CHECK-DAG: ldr [[RHS:s[0-9]+]], [sp, #8]<br>

+; CHECK: fadd s0, [[LHS]], [[RHS]]<br>

+<br>

+  %lhs = extractvalue [2 x float] %in, 0<br>

+  %sum = fadd float %lhs, %rhs<br>

+  ret float %sum<br>

+}<br>

+<br>

+define float @test_hfa_stackalign([8 x float], [1 x float], [2 x float] %in) {<br>

+; CHECK-LABEL: test_hfa_stackalign:<br>

+; CHECK-AAPCS: ldp [[LHS:s[0-9]+]], [[RHS:s[0-9]+]], [sp, #8]<br>

+; CHECK-DARWINPCS: ldp [[LHS:s[0-9]+]], [[RHS:s[0-9]+]], [sp, #4]<br>

+; CHECK: fadd s0, [[LHS]], [[RHS]]<br>

+  %lhs = extractvalue [2 x float] %in, 0<br>

+  %rhs = extractvalue [2 x float] %in, 1<br>

+  %sum = fadd float %lhs, %rhs<br>

+  ret float %sum<br>

+}<br>

+<br>

+; An HFA that ends up on the stack should not have any effect on where<br>

+; integer-based arguments go.<br>

+define i64 @test_hfa_ignores_gprs([7 x float], [2 x float] %in, i64, i64 %res) {<br>

+; CHECK-LABEL: test_hfa_ignores_gprs:<br>

+; CHECK: mov x0, x1<br>

+  ret i64 %res<br>

+}<br>

+<br>

+; [2 x float] should not be promoted to double by the Darwin varargs handling,<br>

+; but should go in an 8-byte aligned slot.<br>

+define void @test_varargs_stackalign() {<br>

+; CHECK-LABEL: test_varargs_stackalign:<br>

+; CHECK-DARWINPCS: stp {{w[0-9]+}}, {{w[0-9]+}}, [sp, #16]<br>

+<br>

+  call void(...)* @callee([3 x float] undef, [2 x float] [float 1.0, float 2.0])<br>

+  ret void<br>

+}<br>

+<br>

+define i64 @test_smallstruct_block([7 x i64], [2 x i64] %in) {<br>

+; CHECK-LABEL: test_smallstruct_block:<br>

+; CHECK: ldp [[LHS:x[0-9]+]], [[RHS:x[0-9]+]], [sp]<br>

+; CHECK: add x0, [[LHS]], [[RHS]]<br>

+  %lhs = extractvalue [2 x i64] %in, 0<br>

+  %rhs = extractvalue [2 x i64] %in, 1<br>

+  %sum = add i64 %lhs, %rhs<br>

+  ret i64 %sum<br>

+}<br>

+<br>

+; Check that a small struct prevents backfilling of registers (i.e. %rhs<br>

+; must go on the stack rather than in x7).<br>

+define i64 @test_smallstruct_block_consume([7 x i64], [2 x i64] %in, i64 %rhs) {<br>

+; CHECK-LABEL: test_smallstruct_block_consume:<br>

+; CHECK-DAG: ldr [[LHS:x[0-9]+]], [sp]<br>

+; CHECK-DAG: ldr [[RHS:x[0-9]+]], [sp, #16]<br>

+; CHECK: add x0, [[LHS]], [[RHS]]<br>

+<br>

+  %lhs = extractvalue [2 x i64] %in, 0<br>

+  %sum = add i64 %lhs, %rhs<br>

+  ret i64 %sum<br>

+}<br>

<br>

Modified: llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll<br>

URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll?rev=222903&r1=222902&r2=222903&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll?rev=222903&r1=222902&r2=222903&view=diff</a><br>

==============================================================================<br>

--- llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll (original)<br>

+++ llvm/trunk/test/CodeGen/AArch64/arm64-variadic-aapcs.ll Thu Nov 27 15:02:42 2014<br>

@@ -96,7 +96,7 @@ define void @test_nospare([8 x i64], [8<br>

<br>

 ; If there are non-variadic arguments on the stack (here two i64s) then the<br>

 ; __stack field should point just past them.<br>

-define void @test_offsetstack([10 x i64], [3 x float], ...) {<br>

+define void @test_offsetstack([8 x i64], [2 x i64], [3 x float], ...) {<br>

 ; CHECK-LABEL: test_offsetstack:<br>

 ; CHECK: sub sp, sp, #80<br>

 ; CHECK: add [[STACK_TOP:x[0-9]+]], sp, #96<br>

<br>

<br>

_______________________________________________<br>

llvm-commits mailing list<br>

<a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>

</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">Best Regards,<div><br></div><div>Kevin Qin</div></div></div>

</div>