[all-commits] [llvm/llvm-project] 3225fc: [SVE] Deal with SVE tuple call arguments correctly...

david-arm via All-commits all-commits at lists.llvm.org
Thu Nov 12 00:42:31 PST 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 3225fcf11eb7741d4f0b7e891685d7133aae5c7b
      https://github.com/llvm/llvm-project/commit/3225fcf11eb7741d4f0b7e891685d7133aae5c7b
  Author: David Sherwood <david.sherwood at arm.com>
  Date:   2020-11-12 (Thu, 12 Nov 2020)

  Changed paths:
    M llvm/include/llvm/CodeGen/CallingConvLower.h
    M llvm/include/llvm/CodeGen/TargetCallingConv.h
    M llvm/lib/CodeGen/CallingConvLower.cpp
    M llvm/lib/Target/AArch64/AArch64CallingConvention.cpp
    M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
    A llvm/test/CodeGen/AArch64/sve-calling-convention-mixed.ll
    R llvm/test/CodeGen/AArch64/sve-calling-convention-tuples-broken.ll

  Log Message:
  -----------
  [SVE] Deal with SVE tuple call arguments correctly when running out of registers

When passing SVE types as arguments to function calls we can run
out of hardware SVE registers. This is normally fine, since we
switch to an indirect mode where we pass a pointer to an SVE stack
object in a GPR. However, if we switch over part-way through
processing an SVE tuple then part of it will be in registers and
the other part will be on the stack.

I've fixed this by ensuring that:

1. When we don't have enough registers to allocate the whole block
   we temporarily mark any remaining SVE registers as allocated.
2. We temporarily remove the InConsecutiveRegs flags from the last
   tuple part argument and reinvoke the autogenerated calling
   convention handler. Doing this prevents the code from entering
   an infinite recursion and, in combination with 1), ensures we
   switch over to the Indirect mode.
3. After allocating a GPR for the pointer to the tuple we then
   deallocate any SVE registers we marked as allocated in 1).
   We also restore the InConsecutiveRegs flags to their previous
   values.
4. I've changed the AArch64ISelLowering LowerCALL and
   LowerFormalArguments functions to detect the start of a tuple,
   allocate a single stack object for it and emit the correct
   number of legal loads and stores.
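
The register bookkeeping in steps 1) to 3) can be sketched with a small
standalone model. This is only an illustration, not the code from the
patch: ToyCCState, TupleLoc, allocFirstFree and assignTuple are
hypothetical names, the model assumes eight SVE argument registers
(Z0-Z7) and eight GPR argument registers (X0-X7), and it leaves out the
InConsecutiveRegs flag handling and the stack object from step 4).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: none of these names exist in LLVM.
struct ToyCCState {
  std::array<bool, 8> zUsed{}; // assumed SVE argument registers Z0-Z7
  std::array<bool, 8> xUsed{}; // assumed GPR argument registers X0-X7

  // Number of SVE argument registers that are still free.
  std::size_t freeZ() const {
    std::size_t n = 0;
    for (bool used : zUsed)
      if (!used)
        ++n;
    return n;
  }
};

// Where a tuple argument ended up.
struct TupleLoc {
  bool indirect = false;       // true: passed as a pointer in a GPR
  std::vector<unsigned> zRegs; // direct case: one Z register per part
  int xReg = -1;               // indirect case: GPR index, -1 if none free
};

// Allocate the lowest-numbered free register, or return -1 if all are used.
int allocFirstFree(std::array<bool, 8> &regs) {
  for (unsigned i = 0; i < regs.size(); ++i) {
    if (!regs[i]) {
      regs[i] = true;
      return static_cast<int>(i);
    }
  }
  return -1;
}

// Place a tuple of numParts SVE vectors either wholly in Z registers or
// wholly behind a pointer, never split between the two.
TupleLoc assignTuple(ToyCCState &s, std::size_t numParts) {
  assert(numParts > 0 && "a tuple has at least one part");
  TupleLoc loc;

  // Enough registers for the whole block: allocate them in order.
  if (s.freeZ() >= numParts) {
    for (std::size_t i = 0; i < numParts; ++i)
      loc.zRegs.push_back(static_cast<unsigned>(allocFirstFree(s.zUsed)));
    return loc;
  }

  // Step 1: temporarily mark every remaining Z register as allocated so
  // that no part of this tuple can land in a register.
  std::vector<unsigned> reserved;
  for (unsigned i = 0; i < s.zUsed.size(); ++i) {
    if (!s.zUsed[i]) {
      s.zUsed[i] = true;
      reserved.push_back(i);
    }
  }

  // Step 2: with no Z register free, the tuple is passed indirectly and
  // only a GPR for the pointer is needed.
  loc.indirect = true;
  loc.xReg = allocFirstFree(s.xUsed);

  // Step 3: release the registers that were reserved in step 1.
  for (unsigned r : reserved)
    s.zUsed[r] = false;
  return loc;
}
```

In this model, assigning a 4-part tuple and then a 3-part tuple uses
Z0-Z6. A following 2-part tuple does not fit in the one remaining
register, so it is passed indirectly through the first free GPR, and Z7
is still free afterwards.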

Differential Revision: https://reviews.llvm.org/D90219



