[llvm] y (PR #65434)

via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 5 19:42:45 PDT 2023


https://github.com/lcvon007 created https://github.com/llvm/llvm-project/pull/65434:

The order of stack objects determines each object's offset relative to
sp/fp. Shorter offsets make the related instructions more likely to be
compressed and require fewer instructions to materialize the offset
immediate, so reordering the stack objects with a suitable cost model
can improve code size.
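
As a rough illustration of the offset cost (a hypothetical helper, not part of the patch): an sp-relative address on RISC-V can be formed with a single addi only when the offset fits in a signed 12-bit immediate; larger offsets take extra instructions, as in the test's `addi x10, x2, 2047; addi x10, x10, 37` sequence below.

```cpp
#include <cassert>

// Rough model (invented for illustration) of how many instructions it
// takes to materialize an sp-relative address on RISC-V: one addi when
// the offset fits in a signed 12-bit immediate, two addis for offsets
// up to 2 * 2047. Even larger offsets would need a lui+addi pair to
// build the immediate and are not modeled here.
int addrInstrCount(int Offset) {
  if (Offset >= -2048 && Offset <= 2047)
    return 1;
  return 2;
}
```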

A precise cost model would require further complexity, and the overall
gain isn't worth it, so this reuses X86's cost model based on an
estimated density:
density = ObjectNumUses / ObjectSize,
where ObjectNumUses is the number of instructions using the frame
object and ObjectSize is the size of the frame object in bytes. The
difference from x86 is that loads/stores are given double weight
because they are more likely to be compressed. Without this extra
weight, code size regresses in some test cases (compressible
loads/stores end up with offsets too large for them to be compressed);
the factor of two is an estimate, and other values may be better in
some cases.
The allocated objects are first split into groups of consecutive
objects with the same alignment, and each group is sorted so that
frame objects with higher density get shorter offsets relative to
sp/fp.
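
The grouping-and-sorting heuristic above can be sketched as follows. This is a simplified standalone model, not the patch itself; the struct and function names are invented, and only the descending-density direction is shown.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Simplified model of the heuristic: each frame object carries its
// size, alignment, and a weighted static use count (loads/stores
// counted twice). Names are invented for illustration.
struct FrameObj {
  int Index;        // frame index in the MFI list
  unsigned Size;    // object size in bytes
  unsigned Align;   // object alignment in bytes
  unsigned NumUses; // weighted static use count
};

// Split the list into maximal runs of equally aligned objects (so no
// extra padding is introduced) and stable-sort each run by descending
// density = NumUses / Size, computed with integer cross-multiplication
// to avoid floating-point comparisons.
void orderByDensity(std::vector<FrameObj> &Objs) {
  auto GroupBegin = Objs.begin();
  while (GroupBegin != Objs.end()) {
    auto GroupEnd = std::next(GroupBegin);
    while (GroupEnd != Objs.end() && GroupEnd->Align == GroupBegin->Align)
      ++GroupEnd;
    std::stable_sort(GroupBegin, GroupEnd,
                     [](const FrameObj &A, const FrameObj &B) {
                       // A.NumUses / A.Size > B.NumUses / B.Size
                       return uint64_t(A.NumUses) * B.Size >
                              uint64_t(B.NumUses) * A.Size;
                     });
    GroupBegin = GroupEnd;
  }
}
```

The actual patch additionally flips the comparison direction depending on whether sp or fp is the base register, so that the densest objects always land closest to the base.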

Differential Revision: https://reviews.llvm.org/D158623


>From d846e9833fc38df0639860ddd37b4c517846fa03 Mon Sep 17 00:00:00 2001
From: laichunfeng <laichunfeng at tencent.com>
Date: Fri, 18 Aug 2023 14:55:09 +0800
Subject: [PATCH] [RISCV] Reorder the stack objects.

The order of stack objects determines each object's offset relative to
sp/fp. Shorter offsets make the related instructions more likely to be
compressed and require fewer instructions to materialize the offset
immediate, so reordering the stack objects with a suitable cost model
can improve code size.

A precise cost model would require further complexity, and the overall
gain isn't worth it, so this reuses X86's cost model based on an
estimated density:
density = ObjectNumUses / ObjectSize,
where ObjectNumUses is the number of instructions using the frame
object and ObjectSize is the size of the frame object in bytes. The
difference from x86 is that loads/stores are given double weight
because they are more likely to be compressed. Without this extra
weight, code size regresses in some test cases (compressible
loads/stores end up with offsets too large for them to be compressed);
the factor of two is an estimate, and other values may be better in
some cases.
The allocated objects are first split into groups of consecutive
objects with the same alignment, and each group is sorted so that
frame objects with higher density get shorter offsets relative to
sp/fp.

Differential Revision: https://reviews.llvm.org/D158623
---
 llvm/lib/Target/RISCV/RISCVFrameLowering.cpp  | 163 +++++++++
 llvm/lib/Target/RISCV/RISCVFrameLowering.h    |   8 +
 .../CodeGen/RISCV/reorder-frame-objects.mir   | 311 ++++++++++++++++++
 3 files changed, 482 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/reorder-frame-objects.mir

diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
index 0588cfa6dbafbe..dc82f6329588f3 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
@@ -25,6 +25,8 @@
 
 #include <algorithm>
 
+#define DEBUG_TYPE "frame-info"
+
 using namespace llvm;
 
 static const Register AllPopRegs[] = {
@@ -453,6 +455,167 @@ static MCCFIInstruction createDefCFAExpression(const TargetRegisterInfo &TRI,
                                         Comment.str());
 }
 
+// Return true if MI is a load or store for which there exists a
+// compressed version.
+static bool isCompressibleLdOrSt(const MachineInstr &MI) {
+  const RISCVSubtarget &STI = MI.getMF()->getSubtarget<RISCVSubtarget>();
+  switch (MI.getOpcode()) {
+  case RISCV::LW:
+  case RISCV::SW:
+  case RISCV::LD:
+  case RISCV::SD:
+    if (STI.hasStdExtCOrZca() || STI.hasStdExtZce())
+      return true;
+    break;
+  case RISCV::FLW:
+  case RISCV::FSW:
+    // C.FLW/C.FSW/C.FLWSP/C.FSWSP are only supported by RV32FC
+    if ((STI.hasStdExtC() || STI.hasStdExtZcf() || (STI.hasStdExtZce())) &&
+        (STI.getFLen() == 32))
+      return true;
+    break;
+  case RISCV::FLD:
+  case RISCV::FSD:
+    // C.FLD/C.FSD/C.FLDSP/C.FSDSP are only supported by RV32DC and RV64DC
+    if ((STI.hasStdExtC() || STI.hasStdExtZcd()) && STI.getFLen() <= 64)
+      return true;
+    break;
+  default:
+    return false;
+  }
+  return false;
+}
+
+void RISCVFrameLowering::orderFrameObjects(
+    const MachineFunction &MF, SmallVectorImpl<int> &ObjectsToAllocate) const {
+  const MachineFrameInfo &MFI = MF.getFrameInfo();
+  const RISCVRegisterInfo *RI = STI.getRegisterInfo();
+  // This reordering is only used to reduce code size.
+  if (!MF.getFunction().hasOptSize())
+    return;
+  // Don't waste time if there's nothing to do.
+  if (ObjectsToAllocate.empty())
+    return;
+
+  // Struct that helps sort the stack objects.
+  struct RISCVFrameSortingObject {
+    unsigned ObjectIndex = 0;         // Index of Object in MFI list.
+    unsigned ObjectSize = 0;          // Size of Object in bytes
+    Align ObjectAlignment = Align(1); // Alignment of Object in bytes.
+    unsigned ObjectNumUses = 0;       // Object static number of uses.
+  };
+
+  // Key: index of object in MFI list.
+  // Value: index of sorting object in SortingObjects vector.
+  DenseMap<int, unsigned> ObjIdxToSortIdx;
+  std::vector<RISCVFrameSortingObject> SortingObjects(ObjectsToAllocate.size());
+
+  // Init SortingObjects.
+  // The stack address of dynamic objects (size zero) is only affected by
+  // the total stack size, so they don't need special handling.
+  for (const auto &[Idx, Obj] : enumerate(ObjectsToAllocate)) {
+    SortingObjects[Idx].ObjectIndex = Obj;
+    SortingObjects[Idx].ObjectAlignment = MFI.getObjectAlign(Obj);
+    SortingObjects[Idx].ObjectSize = MFI.getObjectSize(Obj);
+    // Save index mapping info.
+    ObjIdxToSortIdx[Obj] = Idx;
+  }
+
+  // Count the number of uses for each object.
+  for (auto &MBB : MF) {
+    for (auto &MI : MBB) {
+      if (MI.isDebugInstr())
+        continue;
+      for (const MachineOperand &MO : MI.operands()) {
+        // Check to see if it's a local stack symbol.
+        if (!MO.isFI())
+          continue;
+        int Index = MO.getIndex();
+        // Check to see if it falls within our range, and is tagged
+        // to require ordering.
+        if (Index >= 0 && Index < MFI.getObjectIndexEnd()) {
+          if (ObjIdxToSortIdx.find(Index) != ObjIdxToSortIdx.end()) {
+            if (isCompressibleLdOrSt(MI))
+              // Loads/stores are more likely to be compressed, so double
+              // their weight; the factor of 2 is an estimate.
+              SortingObjects[ObjIdxToSortIdx[Index]].ObjectNumUses += 2;
+            else
+              SortingObjects[ObjIdxToSortIdx[Index]].ObjectNumUses++;
+          }
+        }
+      }
+    }
+  }
+
+  bool UseSpAsBase = true;
+  // Access offset of the FP.
+  if (!RI->hasStackRealignment(MF) && hasFP(MF))
+    UseSpAsBase = false;
+
+  // Split SortingObjects into groups of consecutive objects with the
+  // same alignment, and sort within each group, to avoid introducing
+  // extra padding.
+  // For example, suppose the alignments of the objects in
+  // SortingObjects are as follows:
+  // 1B 1B 4B 1B 4B 4B
+  // They're split into four groups:
+  // group0(1B, 1B)
+  // group1(4B)
+  // group2(1B)
+  // group3(4B, 4B)
+  for (auto SortBegin = SortingObjects.begin(), SortEnd = SortingObjects.end();
+       SortBegin != SortEnd;) {
+    auto SortGroupEnd = std::next(SortBegin);
+    while (SortGroupEnd != SortingObjects.end() &&
+           SortGroupEnd->ObjectAlignment == SortBegin->ObjectAlignment)
+      ++SortGroupEnd;
+    // The current comparison algorithm is to use an estimated
+    // "density". This takes into consideration the size and number of
+    // uses each object has in order to roughly minimize code size.
+    // So, for example, an object of size 16B that is referenced 5 times
+    // will get higher priority than 4B objects referenced 1 time.
+    // Stack symbols with higher priority get shorter offsets relative to
+    // sp/fp, so the instructions that reference them are more likely to
+    // be compressed.
+    std::stable_sort(SortBegin, SortGroupEnd,
+                     [&UseSpAsBase](const RISCVFrameSortingObject &A,
+                                    const RISCVFrameSortingObject &B) {
+                       uint64_t DensityAScaled, DensityBScaled;
+
+                       // The density is calculated by doing:
+                       //     (double)DensityA = A.ObjectNumUses / A.ObjectSize
+                       //     (double)DensityB = B.ObjectNumUses / B.ObjectSize
+                       // Since this approach may cause inconsistencies in
+                       // the floating point <, >, == comparisons, depending on
+                       // the floating point model with which the compiler was
+                       // built, we're going to scale both sides by multiplying
+                       // with A.ObjectSize * B.ObjectSize. This ends up
+                       // factoring away the division and, with it, the need for
+                       // any floating point arithmetic.
+                       DensityAScaled = static_cast<uint64_t>(A.ObjectNumUses) *
+                                        static_cast<uint64_t>(B.ObjectSize);
+                       DensityBScaled = static_cast<uint64_t>(B.ObjectNumUses) *
+                                        static_cast<uint64_t>(A.ObjectSize);
+                       // Make sure the object with the highest density is
+                       // closest to sp/fp.
+                       return UseSpAsBase ? DensityAScaled < DensityBScaled
+                                          : DensityAScaled > DensityBScaled;
+                     });
+
+    SortBegin = SortGroupEnd;
+  }
+  // Now modify the original list to represent the final order that
+  // we want.
+  for (const auto &[Idx, Obj] : enumerate(SortingObjects)) {
+    ObjectsToAllocate[Idx] = Obj.ObjectIndex;
+  }
+
+  LLVM_DEBUG(dbgs() << "Final frame order:\n"; for (auto &Obj
+                                                    : ObjectsToAllocate) {
+    dbgs() << "Frame object index: " << Obj << "\n";
+  });
+}
+
 void RISCVFrameLowering::emitPrologue(MachineFunction &MF,
                                       MachineBasicBlock &MBB) const {
   MachineFrameInfo &MFI = MF.getFrameInfo();
diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.h b/llvm/lib/Target/RISCV/RISCVFrameLowering.h
index 9bc100981f2f7b..c79951ff1a4c55 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.h
@@ -79,6 +79,14 @@ class RISCVFrameLowering : public TargetFrameLowering {
     return StackId != TargetStackID::ScalableVector;
   }
 
+  /// Order the symbols in the local stack.
+  /// We want to place the local stack objects in some sort of sensible order.
+  /// The heuristic we use is to try to pack them according to the static
+  /// number of uses (hot objects first).
+  void
+  orderFrameObjects(const MachineFunction &MF,
+                    SmallVectorImpl<int> &ObjectsToAllocate) const override;
+
 protected:
   const RISCVSubtarget &STI;
 
diff --git a/llvm/test/CodeGen/RISCV/reorder-frame-objects.mir b/llvm/test/CodeGen/RISCV/reorder-frame-objects.mir
new file mode 100644
index 00000000000000..92a7a3ff6cfd9e
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/reorder-frame-objects.mir
@@ -0,0 +1,311 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 2
+# RUN: llc -march=riscv64 -x mir -run-pass=prologepilog -stack-symbol-ordering=0 \
+# RUN: -verify-machineinstrs < %s | FileCheck -check-prefixes=CHECK-RV64-NO-REORDER %s
+# RUN: llc -march=riscv64 -x mir -run-pass=prologepilog \
+# RUN: -verify-machineinstrs < %s | FileCheck -check-prefixes=CHECK-RV64-REORDER %s
+--- |
+
+  define dso_local void @_Z12stack_use_spv() local_unnamed_addr #0 {
+  entry:
+    ret void
+  }
+
+  declare dso_local void @_Z7callee0Pi(ptr noundef) local_unnamed_addr #0
+
+  declare dso_local void @_Z7callee1Pc(ptr noundef) local_unnamed_addr #0
+
+  define dso_local void @_Z12stack_use_fpjj(i32 noundef signext %m, i32 noundef signext %n) local_unnamed_addr #0 {
+  entry:
+    ret void
+  }
+
+  attributes #0 = { minsize optsize }
+
+...
+---
+name:            _Z12stack_use_spv
+alignment:       2
+tracksRegLiveness: true
+tracksDebugUserValues: true
+frameInfo:
+  maxAlignment:    4
+  hasCalls:        true
+  localFrameSize:  2072
+stack:
+  - { id: 0, size: 4, alignment: 4, local-offset: -4 }
+  - { id: 1, size: 1, alignment: 1, local-offset: -5 }
+  - { id: 2, size: 16, alignment: 4, local-offset: -24 }
+  - { id: 3, size: 2048, alignment: 4, local-offset: -2072 }
+machineFunctionInfo:
+  varArgsFrameIndex: 0
+  varArgsSaveSize: 0
+body:             |
+  bb.0.entry:
+    ; CHECK-RV64-NO-REORDER-LABEL: name: _Z12stack_use_spv
+    ; CHECK-RV64-NO-REORDER: liveins: $x1
+    ; CHECK-RV64-NO-REORDER-NEXT: {{  $}}
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = frame-setup ADDI $x2, -2032
+    ; CHECK-RV64-NO-REORDER-NEXT: frame-setup CFI_INSTRUCTION def_cfa_offset 2032
+    ; CHECK-RV64-NO-REORDER-NEXT: SD killed $x1, $x2, 2024 :: (store (s64) into %stack.4)
+    ; CHECK-RV64-NO-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x1, -8
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = frame-setup ADDI $x2, -64
+    ; CHECK-RV64-NO-REORDER-NEXT: frame-setup CFI_INSTRUCTION def_cfa_offset 2096
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI $x2, 2047
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI killed $x10, 37
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI $x2, 2047
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI killed $x10, 36
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee1Pc, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI $x2, 2047
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI killed $x10, 17
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI $x2, 2047
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI killed $x10, 17
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI $x2, 2047
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI killed $x10, 17
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI $x2, 2047
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI killed $x10, 17
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI $x2, 2047
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI killed $x10, 17
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI $x2, 16
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = frame-destroy ADDI $x2, 64
+    ; CHECK-RV64-NO-REORDER-NEXT: $x1 = LD $x2, 2024 :: (load (s64) from %stack.4)
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = frame-destroy ADDI $x2, 2032
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoRET
+    ;
+    ; CHECK-RV64-REORDER-LABEL: name: _Z12stack_use_spv
+    ; CHECK-RV64-REORDER: liveins: $x1
+    ; CHECK-RV64-REORDER-NEXT: {{  $}}
+    ; CHECK-RV64-REORDER-NEXT: $x2 = frame-setup ADDI $x2, -2032
+    ; CHECK-RV64-REORDER-NEXT: frame-setup CFI_INSTRUCTION def_cfa_offset 2032
+    ; CHECK-RV64-REORDER-NEXT: SD killed $x1, $x2, 2024 :: (store (s64) into %stack.4)
+    ; CHECK-RV64-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x1, -8
+    ; CHECK-RV64-REORDER-NEXT: $x2 = frame-setup ADDI $x2, -64
+    ; CHECK-RV64-REORDER-NEXT: frame-setup CFI_INSTRUCTION def_cfa_offset 2096
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI $x2, 2047
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI killed $x10, 37
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI $x2, 2047
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI killed $x10, 36
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee1Pc, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI $x2, 16
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI $x2, 16
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI $x2, 16
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI $x2, 16
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI $x2, 16
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI $x2, 32
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x2 = frame-destroy ADDI $x2, 64
+    ; CHECK-RV64-REORDER-NEXT: $x1 = LD $x2, 2024 :: (load (s64) from %stack.4)
+    ; CHECK-RV64-REORDER-NEXT: $x2 = frame-destroy ADDI $x2, 2032
+    ; CHECK-RV64-REORDER-NEXT: PseudoRET
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = ADDI %stack.0, 0
+    PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = ADDI %stack.1, 0
+    PseudoCALL target-flags(riscv-call) @_Z7callee1Pc, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = ADDI %stack.2, 0
+    PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = ADDI %stack.2, 0
+    PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = ADDI %stack.2, 0
+    PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = ADDI %stack.2, 0
+    PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = ADDI %stack.2, 0
+    PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = ADDI %stack.3, 0
+    PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit killed $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    PseudoRET
+
+...
+---
+name:            _Z12stack_use_fpjj
+alignment:       2
+tracksRegLiveness: true
+tracksDebugUserValues: true
+liveins:
+  - { reg: '$x10' }
+  - { reg: '$x11' }
+frameInfo:
+  maxAlignment:    4
+  hasCalls:        true
+  localFrameSize:  2068
+stack:
+  - { id: 0, size: 2064, alignment: 4, local-offset: -2064 }
+  - { id: 1, size: 4, alignment: 4, local-offset: -2068 }
+  - { id: 2, type: variable-sized, alignment: 1, local-offset: -2068 }
+  - { id: 3, type: variable-sized, alignment: 1, local-offset: -2068 }
+machineFunctionInfo:
+  varArgsFrameIndex: 0
+  varArgsSaveSize: 0
+body:             |
+  bb.0.entry:
+    liveins: $x10, $x11
+
+    ; CHECK-RV64-NO-REORDER-LABEL: name: _Z12stack_use_fpjj
+    ; CHECK-RV64-NO-REORDER: liveins: $x10, $x11, $x1, $x9, $x18, $x19
+    ; CHECK-RV64-NO-REORDER-NEXT: {{  $}}
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = frame-setup ADDI $x2, -2032
+    ; CHECK-RV64-NO-REORDER-NEXT: frame-setup CFI_INSTRUCTION def_cfa_offset 2032
+    ; CHECK-RV64-NO-REORDER-NEXT: SD killed $x1, $x2, 2024 :: (store (s64) into %stack.4)
+    ; CHECK-RV64-NO-REORDER-NEXT: SD killed $x8, $x2, 2016 :: (store (s64) into %stack.5)
+    ; CHECK-RV64-NO-REORDER-NEXT: SD killed $x9, $x2, 2008 :: (store (s64) into %stack.6)
+    ; CHECK-RV64-NO-REORDER-NEXT: SD killed $x18, $x2, 2000 :: (store (s64) into %stack.7)
+    ; CHECK-RV64-NO-REORDER-NEXT: SD killed $x19, $x2, 1992 :: (store (s64) into %stack.8)
+    ; CHECK-RV64-NO-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x1, -8
+    ; CHECK-RV64-NO-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x8, -16
+    ; CHECK-RV64-NO-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x9, -24
+    ; CHECK-RV64-NO-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x18, -32
+    ; CHECK-RV64-NO-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x19, -40
+    ; CHECK-RV64-NO-REORDER-NEXT: $x8 = frame-setup ADDI $x2, 2032
+    ; CHECK-RV64-NO-REORDER-NEXT: frame-setup CFI_INSTRUCTION def_cfa $x8, 0
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = frame-setup ADDI $x2, -96
+    ; CHECK-RV64-NO-REORDER-NEXT: renamable $x19 = COPY $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: renamable $x10 = SLLI killed renamable $x10, 32
+    ; CHECK-RV64-NO-REORDER-NEXT: renamable $x10 = SRLI killed renamable $x10, 30
+    ; CHECK-RV64-NO-REORDER-NEXT: renamable $x10 = nuw ADDI killed renamable $x10, 15
+    ; CHECK-RV64-NO-REORDER-NEXT: renamable $x10 = ANDI killed renamable $x10, -16
+    ; CHECK-RV64-NO-REORDER-NEXT: renamable $x18 = SUB $x2, killed renamable $x10
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = COPY renamable $x18
+    ; CHECK-RV64-NO-REORDER-NEXT: renamable $x11 = SLLI killed renamable $x11, 32
+    ; CHECK-RV64-NO-REORDER-NEXT: renamable $x11 = SRLI killed renamable $x11, 30
+    ; CHECK-RV64-NO-REORDER-NEXT: renamable $x11 = nuw ADDI killed renamable $x11, 15
+    ; CHECK-RV64-NO-REORDER-NEXT: renamable $x11 = ANDI killed renamable $x11, -16
+    ; CHECK-RV64-NO-REORDER-NEXT: renamable $x9 = SUB $x2, killed renamable $x11
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = COPY renamable $x9
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI $x8, -2048
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI killed $x10, -64
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = COPY killed renamable $x18
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = COPY killed renamable $x9
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI $x8, -2048
+    ; CHECK-RV64-NO-REORDER-NEXT: $x10 = ADDI killed $x10, -68
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = COPY killed renamable $x19
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = frame-destroy ADDI $x8, -2048
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = frame-destroy ADDI killed $x2, -80
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = frame-destroy ADDI $x2, 96
+    ; CHECK-RV64-NO-REORDER-NEXT: $x1 = LD $x2, 2024 :: (load (s64) from %stack.4)
+    ; CHECK-RV64-NO-REORDER-NEXT: $x8 = LD $x2, 2016 :: (load (s64) from %stack.5)
+    ; CHECK-RV64-NO-REORDER-NEXT: $x9 = LD $x2, 2008 :: (load (s64) from %stack.6)
+    ; CHECK-RV64-NO-REORDER-NEXT: $x18 = LD $x2, 2000 :: (load (s64) from %stack.7)
+    ; CHECK-RV64-NO-REORDER-NEXT: $x19 = LD $x2, 1992 :: (load (s64) from %stack.8)
+    ; CHECK-RV64-NO-REORDER-NEXT: $x2 = frame-destroy ADDI $x2, 2032
+    ; CHECK-RV64-NO-REORDER-NEXT: PseudoRET
+    ;
+    ; CHECK-RV64-REORDER-LABEL: name: _Z12stack_use_fpjj
+    ; CHECK-RV64-REORDER: liveins: $x10, $x11, $x1, $x9, $x18, $x19
+    ; CHECK-RV64-REORDER-NEXT: {{  $}}
+    ; CHECK-RV64-REORDER-NEXT: $x2 = frame-setup ADDI $x2, -2032
+    ; CHECK-RV64-REORDER-NEXT: frame-setup CFI_INSTRUCTION def_cfa_offset 2032
+    ; CHECK-RV64-REORDER-NEXT: SD killed $x1, $x2, 2024 :: (store (s64) into %stack.4)
+    ; CHECK-RV64-REORDER-NEXT: SD killed $x8, $x2, 2016 :: (store (s64) into %stack.5)
+    ; CHECK-RV64-REORDER-NEXT: SD killed $x9, $x2, 2008 :: (store (s64) into %stack.6)
+    ; CHECK-RV64-REORDER-NEXT: SD killed $x18, $x2, 2000 :: (store (s64) into %stack.7)
+    ; CHECK-RV64-REORDER-NEXT: SD killed $x19, $x2, 1992 :: (store (s64) into %stack.8)
+    ; CHECK-RV64-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x1, -8
+    ; CHECK-RV64-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x8, -16
+    ; CHECK-RV64-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x9, -24
+    ; CHECK-RV64-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x18, -32
+    ; CHECK-RV64-REORDER-NEXT: frame-setup CFI_INSTRUCTION offset $x19, -40
+    ; CHECK-RV64-REORDER-NEXT: $x8 = frame-setup ADDI $x2, 2032
+    ; CHECK-RV64-REORDER-NEXT: frame-setup CFI_INSTRUCTION def_cfa $x8, 0
+    ; CHECK-RV64-REORDER-NEXT: $x2 = frame-setup ADDI $x2, -96
+    ; CHECK-RV64-REORDER-NEXT: renamable $x19 = COPY $x2
+    ; CHECK-RV64-REORDER-NEXT: renamable $x10 = SLLI killed renamable $x10, 32
+    ; CHECK-RV64-REORDER-NEXT: renamable $x10 = SRLI killed renamable $x10, 30
+    ; CHECK-RV64-REORDER-NEXT: renamable $x10 = nuw ADDI killed renamable $x10, 15
+    ; CHECK-RV64-REORDER-NEXT: renamable $x10 = ANDI killed renamable $x10, -16
+    ; CHECK-RV64-REORDER-NEXT: renamable $x18 = SUB $x2, killed renamable $x10
+    ; CHECK-RV64-REORDER-NEXT: $x2 = COPY renamable $x18
+    ; CHECK-RV64-REORDER-NEXT: renamable $x11 = SLLI killed renamable $x11, 32
+    ; CHECK-RV64-REORDER-NEXT: renamable $x11 = SRLI killed renamable $x11, 30
+    ; CHECK-RV64-REORDER-NEXT: renamable $x11 = nuw ADDI killed renamable $x11, 15
+    ; CHECK-RV64-REORDER-NEXT: renamable $x11 = ANDI killed renamable $x11, -16
+    ; CHECK-RV64-REORDER-NEXT: renamable $x9 = SUB $x2, killed renamable $x11
+    ; CHECK-RV64-REORDER-NEXT: $x2 = COPY renamable $x9
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI $x8, -2048
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI killed $x10, -68
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x10 = COPY killed renamable $x18
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x10 = COPY killed renamable $x9
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x10 = ADDI $x8, -52
+    ; CHECK-RV64-REORDER-NEXT: PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ; CHECK-RV64-REORDER-NEXT: $x2 = COPY killed renamable $x19
+    ; CHECK-RV64-REORDER-NEXT: $x2 = frame-destroy ADDI $x8, -2048
+    ; CHECK-RV64-REORDER-NEXT: $x2 = frame-destroy ADDI killed $x2, -80
+    ; CHECK-RV64-REORDER-NEXT: $x2 = frame-destroy ADDI $x2, 96
+    ; CHECK-RV64-REORDER-NEXT: $x1 = LD $x2, 2024 :: (load (s64) from %stack.4)
+    ; CHECK-RV64-REORDER-NEXT: $x8 = LD $x2, 2016 :: (load (s64) from %stack.5)
+    ; CHECK-RV64-REORDER-NEXT: $x9 = LD $x2, 2008 :: (load (s64) from %stack.6)
+    ; CHECK-RV64-REORDER-NEXT: $x18 = LD $x2, 2000 :: (load (s64) from %stack.7)
+    ; CHECK-RV64-REORDER-NEXT: $x19 = LD $x2, 1992 :: (load (s64) from %stack.8)
+    ; CHECK-RV64-REORDER-NEXT: $x2 = frame-destroy ADDI $x2, 2032
+    ; CHECK-RV64-REORDER-NEXT: PseudoRET
+    renamable $x19 = COPY $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    renamable $x10 = SLLI killed renamable $x10, 32
+    renamable $x10 = SRLI killed renamable $x10, 30
+    renamable $x10 = nuw ADDI killed renamable $x10, 15
+    renamable $x10 = ANDI killed renamable $x10, -16
+    renamable $x18 = SUB $x2, killed renamable $x10
+    $x2 = COPY renamable $x18
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    renamable $x11 = SLLI killed renamable $x11, 32
+    renamable $x11 = SRLI killed renamable $x11, 30
+    renamable $x11 = nuw ADDI killed renamable $x11, 15
+    renamable $x11 = ANDI killed renamable $x11, -16
+    renamable $x9 = SUB $x2, killed renamable $x11
+    $x2 = COPY renamable $x9
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = ADDI %stack.0, 0
+    PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = COPY killed renamable $x18
+    PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = COPY killed renamable $x9
+    PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+    $x10 = ADDI %stack.1, 0
+    PseudoCALL target-flags(riscv-call) @_Z7callee0Pi, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+    ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+    $x2 = COPY killed renamable $x19
+    PseudoRET
+
+...



More information about the llvm-commits mailing list