[llvm-branch-commits] [llvm] [AArch64][SME] Support split ZPR and PPR area allocation (PR #142392)
via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Mon Jun 2 07:01:14 PDT 2025
llvmbot wrote:
@llvm/pr-subscribers-backend-aarch64
Author: Benjamin Maxwell (MacDue)
Changes:
For a while we have supported the `-aarch64-stack-hazard-size=<size>` option, which adds "hazard padding" between GPRs and FPRs/ZPRs. However, there is currently a hole in this mitigation: PPR and FPR/ZPR accesses to the same area also cause streaming memory hazards (these are reported by `-pass-remarks-analysis=sme -aarch64-stack-hazard-remark-size=<val>`), yet the current stack layout places PPRs and ZPRs within the same area. That layout looks like this (an example invocation for the remarks follows the diagram):
```
------------------------------------ Higher address
| callee-saved gpr registers        |
|-----------------------------------|
| lr,fp (a.k.a. "frame record")     |
|-----------------------------------| <- fp(=x29)
| <hazard padding>                  |
|-----------------------------------|
| callee-saved fp/simd/SVE regs     |
|-----------------------------------|
| SVE stack objects                 |
|-----------------------------------|
| local variables of fixed size     |
|   <FPR>                           |
|   <hazard padding>                |
|   <GPR>                           |
|-----------------------------------| <- sp
                                      Lower address
```
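For reference, here is a hypothetical `llc` invocation that enables both the padding and the hazard remarks. Only the `-aarch64-stack-hazard-size`, `-pass-remarks-analysis=sme`, and `-aarch64-stack-hazard-remark-size` options come from the description above; the triple, `+sme` attribute, sizes, and file names are placeholder assumptions:
```
llc -mtriple=aarch64-linux-gnu -mattr=+sme \
    -aarch64-stack-hazard-size=1024 \
    -pass-remarks-analysis=sme -aarch64-stack-hazard-remark-size=64 \
    input.ll -o /dev/null
```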
With this patch, the stack (and hazard padding) is rearranged so that the hazard padding is placed between the PPRs and ZPRs rather than within the (fixed-size) callee-save region. The new layout looks like this:
```
------------------------------------ Higher address
| callee-saved gpr registers        |
|-----------------------------------|
| lr,fp (a.k.a. "frame record")     |
|-----------------------------------| <- fp(=x29)
| callee-saved PPRs                 |
| PPR stack objects                 | (These are SVE predicates)
|-----------------------------------|
| <hazard padding>                  |
|-----------------------------------|
| callee-saved ZPR regs             | (These are SVE vectors)
| ZPR stack objects                 | Note: FPRs are promoted to ZPRs
|-----------------------------------|
| local variables of fixed size     |
|   <FPR>                           |
|   <hazard padding>                |
|   <GPR>                           |
|-----------------------------------| <- sp
                                      Lower address
```
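As the diff below shows, the split layout folds the hazard padding into the PPR area's size, giving the PPR stack size a fixed (non-scalable) component alongside its scalable one. A minimal sketch of that accounting, using only names visible in the patch (`getStackHazardSize`, the `AArch64FunctionInfo` size getters, and `StackOffset`); the surrounding function context is assumed:
```cpp
// Sketch, not the patch's exact code: size totals for the split SVE frame.
StackOffset ZPRStackSize = StackOffset::getScalable(AFI->getStackSizeZPR());
StackOffset PPRStackSize =
    StackOffset::get(/*Fixed=*/getStackHazardSize(MF), // hazard padding
                     /*Scalable=*/AFI->getStackSizePPR());
// The epilogue deallocates the whole SVE area as ZPR + PPR (+ padding).
StackOffset SVEStackSize = ZPRStackSize + PPRStackSize;
```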
This layout is only enabled if all of the following hold (a sketch of the combined check follows this list):
* SplitSVEObjects is enabled (`-aarch64-split-sve-objects`)
  - This may be enabled by default in a later patch
* Streaming memory hazards are present (`-aarch64-stack-hazard-size=<val>` != 0)
* PPRs and FPRs/ZPRs are on the stack
* There's no stack realignment or variable-sized objects
  - Handling these is left as a TODO for now
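Putting the conditions together, here is a minimal C++ sketch of the gating logic. The helper name `shouldSplitSVEObjects` is hypothetical; the `SplitSVEObjects` option, `getStackHazardSize`, the `AArch64FunctionInfo` size getters, and `hasVarSizedObjects` all appear in the diff below:
```cpp
// Illustrative sketch only; the patch's actual decision logic may differ.
static bool shouldSplitSVEObjects(const MachineFunction &MF) {
  const auto *AFI = MF.getInfo<AArch64FunctionInfo>();
  const MachineFrameInfo &MFI = MF.getFrameInfo();
  return SplitSVEObjects &&              // -aarch64-split-sve-objects
         getStackHazardSize(MF) != 0 &&  // hazard padding is requested
         AFI->getStackSizePPR() != 0 &&  // PPRs are on the stack...
         AFI->getStackSizeZPR() != 0 &&  // ...and so are ZPRs/FPRs
         !MFI.hasVarSizedObjects();      // realignment/VLA support is a TODO
}
```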
Additionally, any FPR callee-saves that are present will be promoted to ZPRs. This prevents stack hazards between FPRs and GPRs in the fixed-size callee-save area (which would otherwise require more hazard padding, or moving the FPR callee-saves).
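To illustrate the promotion with a hypothetical helper (this is not the patch's actual mechanism): each `z` register's low 64 bits alias the corresponding `d` register, so saving and restoring the containing ZPR also preserves the FPR. The sketch assumes the AAPCS64 callee-saved FPR range d8-d15 and contiguous D/Z register enums:
```cpp
// Hypothetical sketch only; shown to explain why a ZPR save covers the FPR.
static MCRegister promoteFPRToZPR(MCRegister DReg) {
  assert(DReg >= AArch64::D8 && DReg <= AArch64::D15 &&
         "expected an AAPCS64 callee-saved FPR");
  // Z8..Z15 contain D8..D15 in their low 64 bits, so spilling the Z register
  // in the ZPR area preserves the D register without an FPR slot next to
  // the GPR callee-saves.
  return AArch64::Z0 + (DReg - AArch64::D0);
}
```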
This layout should resolve the hole in the hazard-padding mitigation, and is not intended to change codegen for non-SME code.
---
Patch is 172.93 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/142392.diff
8 Files Affected:
- (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+391-133)
- (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.h (+2-1)
- (modified) llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h (+7)
- (added) llvm/test/CodeGen/AArch64/framelayout-split-sve.mir (+526)
- (modified) llvm/test/CodeGen/AArch64/spill-fill-zpr-predicates.mir (+7-10)
- (added) llvm/test/CodeGen/AArch64/split-sve-stack-frame-layout.ll (+751)
- (modified) llvm/test/CodeGen/AArch64/stack-hazard.ll (+542-326)
- (modified) llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll (-2)
``````````diff
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index e5592a921e192..36775b8dc05e5 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -270,6 +270,11 @@ static cl::opt<bool> OrderFrameObjects("aarch64-order-frame-objects",
cl::desc("sort stack allocations"),
cl::init(true), cl::Hidden);
+static cl::opt<bool>
+ SplitSVEObjects("aarch64-split-sve-objects",
+ cl::desc("Split allocation of ZPR & PPR objects"),
+ cl::init(false), cl::Hidden);
+
cl::opt<bool> EnableHomogeneousPrologEpilog(
"homogeneous-prolog-epilog", cl::Hidden,
cl::desc("Emit homogeneous prologue and epilogue for the size "
@@ -458,10 +463,13 @@ static StackOffset getZPRStackSize(const MachineFunction &MF) {
return StackOffset::getScalable(AFI->getStackSizeZPR());
}
-/// Returns the size of the entire PPR stackframe (calleesaves + spills).
+/// Returns the size of the entire PPR stackframe (calleesaves + spills + hazard
+/// padding).
static StackOffset getPPRStackSize(const MachineFunction &MF) {
const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
- return StackOffset::getScalable(AFI->getStackSizePPR());
+ return StackOffset::get(AFI->hasSplitSVEObjects() ? getStackHazardSize(MF)
+ : 0,
+ AFI->getStackSizePPR());
}
/// Returns the size of the entire SVE stackframe (PPRs + ZPRs).
@@ -484,6 +492,10 @@ bool AArch64FrameLowering::canUseRedZone(const MachineFunction &MF) const {
if (!EnableRedZone)
return false;
+ const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
+ if (AFI->hasSplitSVEObjects())
+ return false;
+
// Don't use the red zone if the function explicitly asks us not to.
// This is typically used for kernel code.
const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
@@ -493,7 +505,6 @@ bool AArch64FrameLowering::canUseRedZone(const MachineFunction &MF) const {
return false;
const MachineFrameInfo &MFI = MF.getFrameInfo();
- const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
uint64_t NumBytes = AFI->getLocalStackSize();
// If neither NEON or SVE are available, a COPY from one Q-reg to
@@ -666,9 +677,11 @@ void AArch64FrameLowering::emitCalleeSavedSVELocations(
const TargetRegisterInfo &TRI = *STI.getRegisterInfo();
AArch64FunctionInfo &AFI = *MF.getInfo<AArch64FunctionInfo>();
CFIInstBuilder CFIBuilder(MBB, MBBI, MachineInstr::FrameSetup);
+ StackOffset PPRStackSize = getPPRStackSize(MF);
for (const auto &Info : CSI) {
- if (!MFI.isScalableStackID(Info.getFrameIdx()))
+ int FI = Info.getFrameIdx();
+ if (!MFI.isScalableStackID(FI))
continue;
// Not all unwinders may know about SVE registers, so assume the lowest
@@ -679,9 +692,13 @@ void AArch64FrameLowering::emitCalleeSavedSVELocations(
continue;
StackOffset Offset =
- StackOffset::getScalable(MFI.getObjectOffset(Info.getFrameIdx())) -
+ StackOffset::getScalable(MFI.getObjectOffset(FI)) -
StackOffset::getFixed(AFI.getCalleeSavedStackSize(MFI));
+ if (AFI.hasSplitSVEObjects() &&
+ MFI.getStackID(FI) == TargetStackID::ScalableVector)
+ Offset -= PPRStackSize;
+
CFIBuilder.insertCFIInst(createCFAOffset(TRI, Reg, Offset));
}
}
@@ -2188,33 +2205,68 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF,
StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes);
StackOffset LocalsSize =
PPRLocalsSize + ZPRLocalsSize + StackOffset::getFixed(NumBytes);
- StackOffset SVECalleeSavesSize = PPRCalleeSavesSize + ZPRCalleeSavesSize;
- MachineBasicBlock::iterator CalleeSavesBegin =
- AFI->getPPRCalleeSavedStackSize() ? PPRCalleeSavesBegin
- : ZPRCalleeSavesBegin;
- allocateStackSpace(MBB, CalleeSavesBegin, 0, SVECalleeSavesSize, false,
- nullptr, EmitAsyncCFI && !HasFP, CFAOffset,
- MFI.hasVarSizedObjects() || LocalsSize);
- CFAOffset += SVECalleeSavesSize;
-
- MachineBasicBlock::iterator CalleeSavesEnd =
- AFI->getZPRCalleeSavedStackSize() ? ZPRCalleeSavesEnd : PPRCalleeSavesEnd;
- if (EmitAsyncCFI)
- emitCalleeSavedSVELocations(MBB, CalleeSavesEnd);
-
- // Allocate space for the rest of the frame including SVE locals. Align the
- // stack as necessary.
- assert(!(canUseRedZone(MF) && NeedsRealignment) &&
- "Cannot use redzone with stack realignment");
- if (!canUseRedZone(MF)) {
- // FIXME: in the case of dynamic re-alignment, NumBytes doesn't have
- // the correct value here, as NumBytes also includes padding bytes,
- // which shouldn't be counted here.
- StackOffset SVELocalsSize = PPRLocalsSize + ZPRLocalsSize;
- allocateStackSpace(MBB, CalleeSavesEnd, RealignmentPadding,
- SVELocalsSize + StackOffset::getFixed(NumBytes),
- NeedsWinCFI, &HasWinCFI, EmitAsyncCFI && !HasFP,
- CFAOffset, MFI.hasVarSizedObjects());
+ if (!AFI->hasSplitSVEObjects()) {
+ StackOffset SVECalleeSavesSize = PPRCalleeSavesSize + ZPRCalleeSavesSize;
+ MachineBasicBlock::iterator CalleeSavesBegin =
+ AFI->getPPRCalleeSavedStackSize() ? PPRCalleeSavesBegin
+ : ZPRCalleeSavesBegin;
+ allocateStackSpace(MBB, CalleeSavesBegin, 0, SVECalleeSavesSize, false,
+ nullptr, EmitAsyncCFI && !HasFP, CFAOffset,
+ MFI.hasVarSizedObjects() || LocalsSize);
+ CFAOffset += SVECalleeSavesSize;
+
+ MachineBasicBlock::iterator CalleeSavesEnd =
+ AFI->getZPRCalleeSavedStackSize() ? ZPRCalleeSavesEnd
+ : PPRCalleeSavesEnd;
+ if (EmitAsyncCFI)
+ emitCalleeSavedSVELocations(MBB, CalleeSavesEnd);
+
+ // Allocate space for the rest of the frame including SVE locals. Align the
+ // stack as necessary.
+ assert(!(canUseRedZone(MF) && NeedsRealignment) &&
+ "Cannot use redzone with stack realignment");
+ if (!canUseRedZone(MF)) {
+ // FIXME: in the case of dynamic re-alignment, NumBytes doesn't have
+ // the correct value here, as NumBytes also includes padding bytes,
+ // which shouldn't be counted here.
+ StackOffset SVELocalsSize = PPRLocalsSize + ZPRLocalsSize;
+ allocateStackSpace(MBB, CalleeSavesEnd, RealignmentPadding,
+ SVELocalsSize + StackOffset::getFixed(NumBytes),
+ NeedsWinCFI, &HasWinCFI, EmitAsyncCFI && !HasFP,
+ CFAOffset, MFI.hasVarSizedObjects());
+ }
+ } else {
+ assert(!canUseRedZone(MF) &&
+ "Cannot use redzone with aarch64-split-sve-objects");
+ // TODO: Handle HasWinCFI/NeedsWinCFI?
+ assert(!NeedsWinCFI &&
+ "WinCFI with aarch64-split-sve-objects is not supported");
+
+ // Split ZPR and PPR allocation.
+ // Allocate PPR callee saves
+ allocateStackSpace(MBB, PPRCalleeSavesBegin, 0, PPRCalleeSavesSize, false,
+ nullptr, EmitAsyncCFI && !HasFP, CFAOffset,
+ MFI.hasVarSizedObjects() || ZPRCalleeSavesSize ||
+ ZPRLocalsSize || PPRLocalsSize);
+ CFAOffset += PPRCalleeSavesSize;
+
+ // Allocate PPR locals + ZPR callee saves
+ assert(PPRCalleeSavesEnd == ZPRCalleeSavesBegin &&
+ "Expected ZPR callee saves after PPR locals");
+ allocateStackSpace(MBB, PPRCalleeSavesEnd, RealignmentPadding,
+ PPRLocalsSize + ZPRCalleeSavesSize, false, nullptr,
+ EmitAsyncCFI && !HasFP, CFAOffset,
+ MFI.hasVarSizedObjects() || ZPRLocalsSize);
+ CFAOffset += PPRLocalsSize + ZPRCalleeSavesSize;
+
+ // Allocate ZPR locals
+ allocateStackSpace(MBB, ZPRCalleeSavesEnd, RealignmentPadding,
+ ZPRLocalsSize + StackOffset::getFixed(NumBytes), false,
+ nullptr, EmitAsyncCFI && !HasFP, CFAOffset,
+ MFI.hasVarSizedObjects());
+
+ if (EmitAsyncCFI)
+ emitCalleeSavedSVELocations(MBB, ZPRCalleeSavesEnd);
}
// If we need a base pointer, set it up here. It's whatever the value of the
@@ -2456,7 +2508,9 @@ void AArch64FrameLowering::emitEpilogue(MachineFunction &MF,
}
}
- StackOffset SVEStackSize = getSVEStackSize(MF);
+ StackOffset ZPRStackSize = getZPRStackSize(MF);
+ StackOffset PPRStackSize = getPPRStackSize(MF);
+ StackOffset SVEStackSize = ZPRStackSize + PPRStackSize;
// If there is a single SP update, insert it before the ret and we're done.
if (CombineSPBump) {
@@ -2477,69 +2531,141 @@ void AArch64FrameLowering::emitEpilogue(MachineFunction &MF,
NumBytes -= PrologueSaveSize;
assert(NumBytes >= 0 && "Negative stack allocation size!?");
- // Process the SVE callee-saves to determine what space needs to be
- // deallocated.
- StackOffset DeallocateBefore = {}, DeallocateAfter = SVEStackSize;
- MachineBasicBlock::iterator RestoreBegin = LastPopI, RestoreEnd = LastPopI;
- int64_t ZPRCalleeSavedSize = AFI->getZPRCalleeSavedStackSize();
- int64_t PPRCalleeSavedSize = AFI->getPPRCalleeSavedStackSize();
- int64_t SVECalleeSavedSize = ZPRCalleeSavedSize + PPRCalleeSavedSize;
-
- if (SVECalleeSavedSize) {
- RestoreBegin = std::prev(RestoreEnd);
- while (RestoreBegin != MBB.begin() &&
- IsSVECalleeSave(std::prev(RestoreBegin)))
- --RestoreBegin;
-
- assert(IsSVECalleeSave(RestoreBegin) &&
- IsSVECalleeSave(std::prev(RestoreEnd)) && "Unexpected instruction");
-
- StackOffset CalleeSavedSizeAsOffset =
- StackOffset::getScalable(SVECalleeSavedSize);
- DeallocateBefore = SVEStackSize - CalleeSavedSizeAsOffset;
- DeallocateAfter = CalleeSavedSizeAsOffset;
- }
-
- // Deallocate the SVE area.
- if (SVEStackSize) {
- // If we have stack realignment or variable sized objects on the stack,
- // restore the stack pointer from the frame pointer prior to SVE CSR
- // restoration.
- if (AFI->isStackRealigned() || MFI.hasVarSizedObjects()) {
- if (SVECalleeSavedSize) {
- // Set SP to start of SVE callee-save area from which they can
- // be reloaded. The code below will deallocate the stack space
- // space by moving FP -> SP.
- emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::FP,
- StackOffset::getScalable(-SVECalleeSavedSize), TII,
- MachineInstr::FrameDestroy);
- }
- } else {
- if (SVECalleeSavedSize) {
- // Deallocate the non-SVE locals first before we can deallocate (and
- // restore callee saves) from the SVE area.
- emitFrameOffset(
- MBB, RestoreBegin, DL, AArch64::SP, AArch64::SP,
- StackOffset::getFixed(NumBytes), TII, MachineInstr::FrameDestroy,
- false, false, nullptr, EmitCFI && !hasFP(MF),
- SVEStackSize + StackOffset::getFixed(NumBytes + PrologueSaveSize));
- NumBytes = 0;
+ if (!AFI->hasSplitSVEObjects()) {
+ // Process the SVE callee-saves to determine what space needs to be
+ // deallocated.
+ StackOffset DeallocateBefore = {}, DeallocateAfter = SVEStackSize;
+ MachineBasicBlock::iterator RestoreBegin = LastPopI, RestoreEnd = LastPopI;
+ int64_t ZPRCalleeSavedSize = AFI->getZPRCalleeSavedStackSize();
+ int64_t PPRCalleeSavedSize = AFI->getPPRCalleeSavedStackSize();
+ int64_t SVECalleeSavedSize = ZPRCalleeSavedSize + PPRCalleeSavedSize;
+
+ if (SVECalleeSavedSize) {
+ RestoreBegin = std::prev(RestoreEnd);
+ while (RestoreBegin != MBB.begin() &&
+ IsSVECalleeSave(std::prev(RestoreBegin)))
+ --RestoreBegin;
+
+ assert(IsSVECalleeSave(RestoreBegin) &&
+ IsSVECalleeSave(std::prev(RestoreEnd)) &&
+ "Unexpected instruction");
+
+ StackOffset CalleeSavedSizeAsOffset =
+ StackOffset::getScalable(SVECalleeSavedSize);
+ DeallocateBefore = SVEStackSize - CalleeSavedSizeAsOffset;
+ DeallocateAfter = CalleeSavedSizeAsOffset;
+ }
+
+ // Deallocate the SVE area.
+ if (SVEStackSize) {
+ // If we have stack realignment or variable sized objects on the stack,
+ // restore the stack pointer from the frame pointer prior to SVE CSR
+ // restoration.
+ if (AFI->isStackRealigned() || MFI.hasVarSizedObjects()) {
+ if (SVECalleeSavedSize) {
+ // Set SP to start of SVE callee-save area from which they can
+ // be reloaded. The code below will deallocate the stack space
+ // space by moving FP -> SP.
+ emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::FP,
+ StackOffset::getScalable(-SVECalleeSavedSize), TII,
+ MachineInstr::FrameDestroy);
+ }
+ } else {
+ if (SVECalleeSavedSize) {
+ // Deallocate the non-SVE locals first before we can deallocate (and
+ // restore callee saves) from the SVE area.
+ emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::SP,
+ StackOffset::getFixed(NumBytes), TII,
+ MachineInstr::FrameDestroy, false, false, nullptr,
+ EmitCFI && !hasFP(MF),
+ SVEStackSize + StackOffset::getFixed(
+ NumBytes + PrologueSaveSize));
+ NumBytes = 0;
+ }
+
+ emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::SP,
+ DeallocateBefore, TII, MachineInstr::FrameDestroy,
+ false, false, nullptr, EmitCFI && !hasFP(MF),
+ SVEStackSize +
+ StackOffset::getFixed(NumBytes + PrologueSaveSize));
+
+ emitFrameOffset(MBB, RestoreEnd, DL, AArch64::SP, AArch64::SP,
+ DeallocateAfter, TII, MachineInstr::FrameDestroy, false,
+ false, nullptr, EmitCFI && !hasFP(MF),
+ DeallocateAfter +
+ StackOffset::getFixed(NumBytes + PrologueSaveSize));
}
+ if (EmitCFI)
+ emitCalleeSavedSVERestores(MBB, RestoreEnd);
+ }
+ } else if (SVEStackSize) {
+ assert(!AFI->isStackRealigned() && !MFI.hasVarSizedObjects() &&
+ "TODO: Support stack realigment / variable-sized objects");
+ // SplitSVEObjects. Determine the sizes and starts/ends of the ZPR and PPR
+ // areas.
+ auto ZPRCalleeSavedSize =
+ StackOffset::getScalable(AFI->getZPRCalleeSavedStackSize());
+ auto PPRCalleeSavedSize =
+ StackOffset::getScalable(AFI->getPPRCalleeSavedStackSize());
+ StackOffset PPRLocalsSize = PPRStackSize - PPRCalleeSavedSize;
+ StackOffset ZPRLocalsSize = ZPRStackSize - ZPRCalleeSavedSize;
+
+ MachineBasicBlock::iterator PPRRestoreBegin = LastPopI,
+ PPRRestoreEnd = LastPopI;
+ if (PPRCalleeSavedSize) {
+ PPRRestoreBegin = std::prev(PPRRestoreEnd);
+ while (PPRRestoreBegin != MBB.begin() &&
+ IsPPRCalleeSave(std::prev(PPRRestoreBegin)))
+ --PPRRestoreBegin;
+ }
+
+ MachineBasicBlock::iterator ZPRRestoreBegin = PPRRestoreBegin,
+ ZPRRestoreEnd = PPRRestoreBegin;
+ if (ZPRCalleeSavedSize) {
+ ZPRRestoreBegin = std::prev(ZPRRestoreEnd);
+ while (ZPRRestoreBegin != MBB.begin() &&
+ IsZPRCalleeSave(std::prev(ZPRRestoreBegin)))
+ --ZPRRestoreBegin;
+ }
+
+ auto CFAOffset =
+ SVEStackSize + StackOffset::getFixed(NumBytes + PrologueSaveSize);
+ if (PPRCalleeSavedSize || ZPRCalleeSavedSize) {
+ // Deallocate the non-SVE locals first before we can deallocate (and
+ // restore callee saves) from the SVE area.
+ auto NonSVELocals = StackOffset::getFixed(NumBytes);
+ emitFrameOffset(MBB, ZPRRestoreBegin, DL, AArch64::SP, AArch64::SP,
+ NonSVELocals, TII, MachineInstr::FrameDestroy, false,
+ false, nullptr, EmitCFI && !hasFP(MF), CFAOffset);
+ NumBytes = 0;
+ CFAOffset -= NonSVELocals;
+ }
+
+ if (ZPRLocalsSize) {
+ emitFrameOffset(MBB, ZPRRestoreBegin, DL, AArch64::SP, AArch64::SP,
+ ZPRLocalsSize, TII, MachineInstr::FrameDestroy, false,
+ false, nullptr, EmitCFI && !hasFP(MF), CFAOffset);
+ CFAOffset -= ZPRLocalsSize;
+ }
- emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::SP,
- DeallocateBefore, TII, MachineInstr::FrameDestroy, false,
- false, nullptr, EmitCFI && !hasFP(MF),
- SVEStackSize +
- StackOffset::getFixed(NumBytes + PrologueSaveSize));
-
- emitFrameOffset(MBB, RestoreEnd, DL, AArch64::SP, AArch64::SP,
- DeallocateAfter, TII, MachineInstr::FrameDestroy, false,
- false, nullptr, EmitCFI && !hasFP(MF),
- DeallocateAfter +
- StackOffset::getFixed(NumBytes + PrologueSaveSize));
+ if (PPRLocalsSize || ZPRCalleeSavedSize) {
+ assert(PPRRestoreBegin == ZPRRestoreEnd &&
+ "Expected PPR restores after ZPR");
+ emitFrameOffset(MBB, PPRRestoreBegin, DL, AArch64::SP, AArch64::SP,
+ PPRLocalsSize + ZPRCalleeSavedSize, TII,
+ MachineInstr::FrameDestroy, false, false, nullptr,
+ EmitCFI && !hasFP(MF), CFAOffset);
+ CFAOffset -= PPRLocalsSize + ZPRCalleeSavedSize;
}
+ if (PPRCalleeSavedSize) {
+ emitFrameOffset(MBB, PPRRestoreEnd, DL, AArch64::SP, AArch64::SP,
+ PPRCalleeSavedSize, TII, MachineInstr::FrameDestroy,
+ false, false, nullptr, EmitCFI && !hasFP(MF), CFAOffset);
+ }
+
+ // We only emit CFI information for ZPRs so emit CFI after the ZPR restores.
if (EmitCFI)
- emitCalleeSavedSVERestores(MBB, RestoreEnd);
+ emitCalleeSavedSVERestores(MBB, ZPRRestoreEnd);
}
if (!hasFP(MF)) {
@@ -2660,9 +2786,15 @@ AArch64FrameLowering::getFrameIndexReferenceFromSP(const MachineFunction &MF,
return StackOffset::getFixed(ObjectOffset - getOffsetOfLocalArea());
const auto *AFI = MF.getInfo<AArch64FunctionInfo>();
- if (MFI.isScalableStackID(FI))
- return StackOffset::get(-((int64_t)AFI->getCalleeSavedStackSize()),
+ if (MFI.isScalableStackID(FI)) {
+ StackOffset AccessOffset{};
+ if (AFI->hasSplitSVEObjects() &&
+ MFI.getStackID(FI) == TargetStackID::ScalableVector)
+ AccessOffset = -PPRStackSize;
+ return AccessOffset +
+ StackOffset::get(-((int64_t)AFI->getCalleeSavedStackSize()),
ObjectOffset);
+ }
bool IsFixed = MFI.isFixedObjectIndex(FI);
bool IsCSR =
@@ -2720,12 +2852,12 @@ StackOffset AArch64FrameLowering::resolveFrameIndexReference(
bool isFixed = MFI.isFixedObjectIndex(FI);
bool isSVE = MFI.isScalableStackID(FI);
return resolveFrameOffsetReference(MF, ObjectOffset, isFixed, isSVE, FrameReg,
- PreferFP, ForSimm);
+ PreferFP, ForSimm, FI);
}
StackOffset AArch64FrameLowering::resolveFrameOffsetReference(
const MachineFunction &MF, int64_t ObjectOffset, bool isFixed, bool isSVE,
- Register &FrameReg, bool PreferFP, bool ForSimm) const {
+ Register &FrameReg, bool PreferFP, bool ForSimm, int64_t FI) const {
const auto &MFI = MF.getFrameInfo();
const auto *RegInfo = static_cast<const AArch64RegisterInfo *>(
MF.getSubtarget().getRegisterInfo());
@@ -2737,7 +2869,9 @@ StackOffset AArch64FrameLowering::resolveFrameOffsetReference(
bool isCSR =
!isFixed && ObjectOffset >= -((int)AFI->getCalleeSavedStackSize(MFI));
- const StackOffset SVEStackSize = getSVEStackSize(MF);
+ StackOffset ZPRStackSize = getZPRStackSize(MF);
+ StackOffset PPRStackSize = getPPRStackSize(MF);
+ StackOffset SVEStackSize = ZPRStackSize + PPRStackSize;
// Use frame pointer to reference fixed object...
[truncated]
``````````
https://github.com/llvm/llvm-project/pull/142392