[llvm] [AArch64][SME] Rework VG CFI information for streaming-mode changes (PR #152283)

via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 6 02:52:24 PDT 2025


llvmbot wrote:



@llvm/pr-subscribers-backend-aarch64

Author: Benjamin Maxwell (MacDue)

<details>
<summary>Changes</summary>

This patch reworks how VG is handled around streaming mode changes.

Previously, for functions with streaming mode changes, we would:

- Save the incoming VG in the prologue
- Emit `.cfi_offset vg, <offset>` and `.cfi_restore vg` around streaming mode changes

Additionally, for locally streaming functions, we would:

- Also save the streaming VG in the prologue
- Emit `.cfi_offset vg, <incoming VG offset>` in the prologue
- Emit `.cfi_offset vg, <streaming VG offset>` and `.cfi_restore vg` around streaming mode changes

In both cases, this does more than necessary and is hard for an unwinder to parse, as using `.cfi_offset` in this way does not follow the semantics of the underlying DWARF CFI opcodes.

So the new scheme in this patch, in functions with streaming mode changes (including locally streaming functions), is to:

- Save the incoming VG in the prologue
- Emit `.cfi_offset vg, <offset>` in the prologue (not at streaming mode changes)
- Never emit `.cfi_restore vg` (this is not meaningful for unwinding)
- Explicitly reference the saved incoming VG in the CFI expressions for SVE callee-saves (see the sketch below)
- Ensure the CFA is not described in terms of VG

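For illustration, here is a minimal sketch (hypothetical names and parameters, not code from this patch) of what the DWARF expression for an SVE callee-save computes under this scheme. Rather than reading the live VG register, the rule re-loads the *incoming* VG from its spill slot at a fixed offset from the CFA and scales by that value, so it stays correct even after the function changes streaming mode:

```cpp
#include <cstdint>
#include <functional>

// Hypothetical parameters; the real offsets and scales come from the frame
// layout computed in AArch64FrameLowering.
struct SVESaveRule {
  int64_t IncomingVGSlotOffset; // offset of the spilled incoming VG from the CFA
  int64_t VGScaledBytes;        // scalable part of the save-slot offset (bytes per VG)
  int64_t FixedBytes;           // non-scalable part of the save-slot offset
};

// Mirrors the effect of the DW_OP_dup / DW_OP_plus / DW_OP_deref / DW_OP_mul
// sequence the patch emits:
//   address = CFA + load64(CFA + IncomingVGSlotOffset) * VGScaledBytes + FixedBytes
uint64_t sveSaveAddress(uint64_t CFA, const SVESaveRule &Rule,
                        const std::function<uint64_t(uint64_t)> &Load64) {
  // DW_OP_dup; DW_OP_consts <slot offset>; DW_OP_plus; DW_OP_deref
  int64_t IncomingVG = static_cast<int64_t>(Load64(CFA + Rule.IncomingVGSlotOffset));
  // DW_OP_consts <scale>; DW_OP_mul; DW_OP_plus (plus the fixed part, if any)
  return CFA + static_cast<uint64_t>(IncomingVG * Rule.VGScaledBytes + Rule.FixedBytes);
}
```
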
A more in-depth discussion of this scheme is available at: https://gist.github.com/MacDue/b7a5c45d131d2440858165bfc903e97b

But the TL;DR is that, following this scheme, SME unwinding can be implemented with minimal changes to existing unwinders. All an unwinder needs to do is initialize VG to `CNTD` at the start of unwinding; everything else is handled by standard DWARF opcodes, which need no changes to handle VG.
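
As a concrete illustration, here is a minimal sketch (hypothetical unwinder types and helper names, assuming an SVE-enabled build) of the only unwinder-side change this implies:

```cpp
#include <cstdint>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h> // for svcntd()
#endif

// Hypothetical register-state type maintained by an unwinder.
struct RegisterState {
  uint64_t VG = 0;
  // ... other registers ...
};

// Seed the DWARF VG pseudo-register once, before walking any frames. After
// this, the standard CFI rules in each frame's FDE (offsets and expressions)
// describe everything, including frames with streaming-mode changes.
inline void seedVGForUnwind(RegisterState &Regs) {
#if defined(__ARM_FEATURE_SVE)
  // svcntd() returns the current vector length in 64-bit granules, i.e. the
  // same value CNTD (and hence the DWARF VG register) reports.
  Regs.VG = svcntd();
#endif
}
```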

---

Patch is 382.23 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/152283.diff


28 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+55-117) 
- (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (-13) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+24-5) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.h (+4-2) 
- (modified) llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h (-10) 
- (modified) llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td (-16) 
- (modified) llvm/lib/Target/AArch64/SMEPeepholeOpt.cpp (+2-19) 
- (modified) llvm/test/CodeGen/AArch64/outlining-with-streaming-mode-changes.ll (+3-9) 
- (modified) llvm/test/CodeGen/AArch64/sme-agnostic-za.ll (+10-18) 
- (modified) llvm/test/CodeGen/AArch64/sme-call-streaming-compatible-to-normal-fn-wihout-sme-attr.ll (+8-14) 
- (modified) llvm/test/CodeGen/AArch64/sme-callee-save-restore-pairs.ll (+16-32) 
- (modified) llvm/test/CodeGen/AArch64/sme-darwin-sve-vg.ll (+18-20) 
- (modified) llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll (+14-37) 
- (modified) llvm/test/CodeGen/AArch64/sme-lazy-save-call.ll (+4-6) 
- (modified) llvm/test/CodeGen/AArch64/sme-must-save-lr-for-vg.ll (+5-6) 
- (modified) llvm/test/CodeGen/AArch64/sme-peephole-opts.ll (+126-59) 
- (modified) llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll (+198-284) 
- (modified) llvm/test/CodeGen/AArch64/sme-streaming-body-streaming-compatible-interface.ll (+13-28) 
- (modified) llvm/test/CodeGen/AArch64/sme-streaming-body.ll (+46-84) 
- (modified) llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll (+40-60) 
- (modified) llvm/test/CodeGen/AArch64/sme-streaming-interface.ll (+17-30) 
- (added) llvm/test/CodeGen/AArch64/sme-streaming-mode-changes-unwindinfo.ll (+308) 
- (modified) llvm/test/CodeGen/AArch64/sme-streaming-mode-changing-call-disable-stackslot-scavenging.ll (+13-15) 
- (modified) llvm/test/CodeGen/AArch64/sme-vg-to-stack.ll (+162-169) 
- (modified) llvm/test/CodeGen/AArch64/ssve-stack-hazard-remarks.ll (+8-8) 
- (modified) llvm/test/CodeGen/AArch64/stack-hazard.ll (+349-344) 
- (modified) llvm/test/CodeGen/AArch64/streaming-compatible-memory-ops.ll (+25-45) 
- (modified) llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll (+60-57) 


``````````diff
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index 885f2a94f85f5..de9d865465901 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -338,9 +338,11 @@ static bool requiresSaveVG(const MachineFunction &MF);
 // Conservatively, returns true if the function is likely to have an SVE vectors
 // on the stack. This function is safe to be called before callee-saves or
 // object offsets have been determined.
-static bool isLikelyToHaveSVEStack(MachineFunction &MF) {
+static bool isLikelyToHaveSVEStack(const MachineFunction &MF) {
   auto *AFI = MF.getInfo<AArch64FunctionInfo>();
-  if (AFI->isSVECC())
+  if (MF.getFunction().getCallingConv() ==
+          CallingConv::AArch64_SVE_VectorCall ||
+      AFI->isSVECC())
     return true;
 
   if (AFI->hasCalculatedStackSizeSVE())
@@ -532,6 +534,7 @@ bool AArch64FrameLowering::canUseRedZone(const MachineFunction &MF) const {
 bool AArch64FrameLowering::hasFPImpl(const MachineFunction &MF) const {
   const MachineFrameInfo &MFI = MF.getFrameInfo();
   const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();
+  const AArch64FunctionInfo &AFI = *MF.getInfo<AArch64FunctionInfo>();
 
   // Win64 EH requires a frame pointer if funclets are present, as the locals
   // are accessed off the frame pointer in both the parent function and the
@@ -545,6 +548,16 @@ bool AArch64FrameLowering::hasFPImpl(const MachineFunction &MF) const {
       MFI.hasStackMap() || MFI.hasPatchPoint() ||
       RegInfo->hasStackRealignment(MF))
     return true;
+  // If we have streaming mode changes and SVE registers on the stack we need a
+  // FP. This is as the stack size may depend on the VG at entry to the
+  // function, which is saved before the SVE area (so unrecoverable without a
+  // FP). Similar for locally streaming functions, but it is because we use
+  // ADDSVL to setup the SVE stack (which might not match VG, even without
+  // streaming-mode changes).
+  if (AFI.needsDwarfUnwindInfo(MF) &&
+      ((requiresSaveVG(MF) || AFI.getSMEFnAttrs().hasStreamingBody()) &&
+       (!AFI.hasCalculatedStackSizeSVE() || AFI.getStackSizeSVE() > 0)))
+    return true;
   // With large callframes around we may need to use FP to access the scavenging
   // emergency spillslot.
   //
@@ -663,10 +676,6 @@ void AArch64FrameLowering::emitCalleeSavedGPRLocations(
     MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const {
   MachineFunction &MF = *MBB.getParent();
   MachineFrameInfo &MFI = MF.getFrameInfo();
-  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
-  SMEAttrs Attrs = AFI->getSMEFnAttrs();
-  bool LocallyStreaming =
-      Attrs.hasStreamingBody() && !Attrs.hasStreamingInterface();
 
   const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
   if (CSI.empty())
@@ -680,14 +689,6 @@ void AArch64FrameLowering::emitCalleeSavedGPRLocations(
 
     assert(!Info.isSpilledToReg() && "Spilling to registers not implemented");
     int64_t Offset = MFI.getObjectOffset(FrameIdx) - getOffsetOfLocalArea();
-
-    // The location of VG will be emitted before each streaming-mode change in
-    // the function. Only locally-streaming functions require emitting the
-    // non-streaming VG location here.
-    if ((LocallyStreaming && FrameIdx == AFI->getStreamingVGIdx()) ||
-        (!LocallyStreaming && Info.getReg() == AArch64::VG))
-      continue;
-
     CFIBuilder.buildOffset(Info.getReg(), Offset);
   }
 }
@@ -707,8 +708,16 @@ void AArch64FrameLowering::emitCalleeSavedSVELocations(
   AArch64FunctionInfo &AFI = *MF.getInfo<AArch64FunctionInfo>();
   CFIInstBuilder CFIBuilder(MBB, MBBI, MachineInstr::FrameSetup);
 
+  std::optional<int64_t> IncomingVGOffsetFromDefCFA;
+  if (requiresSaveVG(MF)) {
+    auto IncomingVG = *find_if(
+        reverse(CSI), [](auto &Info) { return Info.getReg() == AArch64::VG; });
+    IncomingVGOffsetFromDefCFA =
+        MFI.getObjectOffset(IncomingVG.getFrameIdx()) - getOffsetOfLocalArea();
+  }
+
   for (const auto &Info : CSI) {
-    if (!(MFI.getStackID(Info.getFrameIdx()) == TargetStackID::ScalableVector))
+    if (MFI.getStackID(Info.getFrameIdx()) != TargetStackID::ScalableVector)
       continue;
 
     // Not all unwinders may know about SVE registers, so assume the lowest
@@ -722,7 +731,8 @@ void AArch64FrameLowering::emitCalleeSavedSVELocations(
         StackOffset::getScalable(MFI.getObjectOffset(Info.getFrameIdx())) -
         StackOffset::getFixed(AFI.getCalleeSavedStackSize(MFI));
 
-    CFIBuilder.insertCFIInst(createCFAOffset(TRI, Reg, Offset));
+    CFIBuilder.insertCFIInst(
+        createCFAOffset(TRI, Reg, Offset, IncomingVGOffsetFromDefCFA));
   }
 }
 
@@ -1465,10 +1475,10 @@ bool requiresGetVGCall(MachineFunction &MF) {
 
 static bool requiresSaveVG(const MachineFunction &MF) {
   const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
+  if (!AFI->needsDwarfUnwindInfo(MF) || !AFI->hasStreamingModeChanges())
+    return false;
   // For Darwin platforms we don't save VG for non-SVE functions, even if SME
   // is enabled with streaming mode changes.
-  if (!AFI->hasStreamingModeChanges())
-    return false;
   auto &ST = MF.getSubtarget<AArch64Subtarget>();
   if (ST.isTargetDarwin())
     return ST.hasSVE();
@@ -1477,8 +1487,7 @@ static bool requiresSaveVG(const MachineFunction &MF) {
 
 bool isVGInstruction(MachineBasicBlock::iterator MBBI) {
   unsigned Opc = MBBI->getOpcode();
-  if (Opc == AArch64::CNTD_XPiI || Opc == AArch64::RDSVLI_XI ||
-      Opc == AArch64::UBFMXri)
+  if (Opc == AArch64::CNTD_XPiI)
     return true;
 
   if (requiresGetVGCall(*MBBI->getMF())) {
@@ -1507,9 +1516,8 @@ static MachineBasicBlock::iterator convertCalleeSaveRestoreToSPPrePostIncDec(
   unsigned NewOpc;
 
   // If the function contains streaming mode changes, we expect instructions
-  // to calculate the value of VG before spilling. For locally-streaming
-  // functions, we need to do this for both the streaming and non-streaming
-  // vector length. Move past these instructions if necessary.
+  // to calculate the value of VG before spilling. Move past these instructions
+  // if necessary.
   MachineFunction &MF = *MBB.getParent();
   if (requiresSaveVG(MF))
     while (isVGInstruction(MBBI))
@@ -3469,7 +3477,6 @@ bool AArch64FrameLowering::spillCalleeSavedRegisters(
     ArrayRef<CalleeSavedInfo> CSI, const TargetRegisterInfo *TRI) const {
   MachineFunction &MF = *MBB.getParent();
   const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
-  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
   bool NeedsWinCFI = needsWinCFI(MF);
   DebugLoc DL;
   SmallVector<RegPairInfo, 8> RegPairs;
@@ -3538,40 +3545,31 @@ bool AArch64FrameLowering::spillCalleeSavedRegisters(
     }
 
     unsigned X0Scratch = AArch64::NoRegister;
+    auto RestoreX0 = make_scope_exit([&] {
+      if (X0Scratch != AArch64::NoRegister)
+        BuildMI(MBB, MI, DL, TII.get(AArch64::ORRXrr), AArch64::X0)
+            .addReg(AArch64::XZR)
+            .addReg(X0Scratch, RegState::Undef)
+            .addReg(X0Scratch, RegState::Implicit)
+            .setMIFlag(MachineInstr::FrameSetup);
+    });
+
     if (Reg1 == AArch64::VG) {
       // Find an available register to store value of VG to.
       Reg1 = findScratchNonCalleeSaveRegister(&MBB, true);
       assert(Reg1 != AArch64::NoRegister);
-      SMEAttrs Attrs = AFI->getSMEFnAttrs();
-
-      if (Attrs.hasStreamingBody() && !Attrs.hasStreamingInterface() &&
-          AFI->getStreamingVGIdx() == std::numeric_limits<int>::max()) {
-        // For locally-streaming functions, we need to store both the streaming
-        // & non-streaming VG. Spill the streaming value first.
-        BuildMI(MBB, MI, DL, TII.get(AArch64::RDSVLI_XI), Reg1)
-            .addImm(1)
-            .setMIFlag(MachineInstr::FrameSetup);
-        BuildMI(MBB, MI, DL, TII.get(AArch64::UBFMXri), Reg1)
-            .addReg(Reg1)
-            .addImm(3)
-            .addImm(63)
-            .setMIFlag(MachineInstr::FrameSetup);
-
-        AFI->setStreamingVGIdx(RPI.FrameIdx);
-      } else if (MF.getSubtarget<AArch64Subtarget>().hasSVE()) {
+      if (MF.getSubtarget<AArch64Subtarget>().hasSVE()) {
         BuildMI(MBB, MI, DL, TII.get(AArch64::CNTD_XPiI), Reg1)
             .addImm(31)
             .addImm(1)
             .setMIFlag(MachineInstr::FrameSetup);
-        AFI->setVGIdx(RPI.FrameIdx);
       } else {
         const AArch64Subtarget &STI = MF.getSubtarget<AArch64Subtarget>();
-        if (llvm::any_of(
-                MBB.liveins(),
-                [&STI](const MachineBasicBlock::RegisterMaskPair &LiveIn) {
-                  return STI.getRegisterInfo()->isSuperOrSubRegisterEq(
-                      AArch64::X0, LiveIn.PhysReg);
-                }))
+        if (any_of(MBB.liveins(),
+                   [&STI](const MachineBasicBlock::RegisterMaskPair &LiveIn) {
+                     return STI.getRegisterInfo()->isSuperOrSubRegisterEq(
+                         AArch64::X0, LiveIn.PhysReg);
+                   }))
           X0Scratch = Reg1;
 
         if (X0Scratch != AArch64::NoRegister)
@@ -3590,7 +3588,6 @@ bool AArch64FrameLowering::spillCalleeSavedRegisters(
             .addReg(AArch64::X0, RegState::ImplicitDefine)
             .setMIFlag(MachineInstr::FrameSetup);
         Reg1 = AArch64::X0;
-        AFI->setVGIdx(RPI.FrameIdx);
       }
     }
 
@@ -3685,13 +3682,6 @@ bool AArch64FrameLowering::spillCalleeSavedRegisters(
       if (RPI.isPaired())
         MFI.setStackID(FrameIdxReg2, TargetStackID::ScalableVector);
     }
-
-    if (X0Scratch != AArch64::NoRegister)
-      BuildMI(MBB, MI, DL, TII.get(AArch64::ORRXrr), AArch64::X0)
-          .addReg(AArch64::XZR)
-          .addReg(X0Scratch, RegState::Undef)
-          .addReg(X0Scratch, RegState::Implicit)
-          .setMIFlag(MachineInstr::FrameSetup);
   }
   return true;
 }
@@ -4070,15 +4060,8 @@ void AArch64FrameLowering::determineCalleeSaves(MachineFunction &MF,
 
   // Increase the callee-saved stack size if the function has streaming mode
   // changes, as we will need to spill the value of the VG register.
-  // For locally streaming functions, we spill both the streaming and
-  // non-streaming VG value.
-  SMEAttrs Attrs = AFI->getSMEFnAttrs();
-  if (requiresSaveVG(MF)) {
-    if (Attrs.hasStreamingBody() && !Attrs.hasStreamingInterface())
-      CSStackSize += 16;
-    else
-      CSStackSize += 8;
-  }
+  if (requiresSaveVG(MF))
+    CSStackSize += 8;
 
   // Determine if a Hazard slot should be used, and increase the CSStackSize by
   // StackHazardSize if so.
@@ -4229,29 +4212,19 @@ bool AArch64FrameLowering::assignCalleeSavedSpillSlots(
 
   // Insert VG into the list of CSRs, immediately before LR if saved.
   if (requiresSaveVG(MF)) {
-    std::vector<CalleeSavedInfo> VGSaves;
-    SMEAttrs Attrs = AFI->getSMEFnAttrs();
-
-    auto VGInfo = CalleeSavedInfo(AArch64::VG);
+    CalleeSavedInfo VGInfo(AArch64::VG);
     VGInfo.setRestored(false);
-    VGSaves.push_back(VGInfo);
-
-    // Add VG again if the function is locally-streaming, as we will spill two
-    // values.
-    if (Attrs.hasStreamingBody() && !Attrs.hasStreamingInterface())
-      VGSaves.push_back(VGInfo);
-
-    bool InsertBeforeLR = false;
 
+    bool InsertedBeforeLR = false;
     for (unsigned I = 0; I < CSI.size(); I++)
       if (CSI[I].getReg() == AArch64::LR) {
-        InsertBeforeLR = true;
-        CSI.insert(CSI.begin() + I, VGSaves.begin(), VGSaves.end());
+        InsertedBeforeLR = true;
+        CSI.insert(CSI.begin() + I, VGInfo);
         break;
       }
 
-    if (!InsertBeforeLR)
-      llvm::append_range(CSI, VGSaves);
+    if (!InsertedBeforeLR)
+      CSI.push_back(VGInfo);
   }
 
   Register LastReg = 0;
@@ -5254,46 +5227,11 @@ MachineBasicBlock::iterator tryMergeAdjacentSTG(MachineBasicBlock::iterator II,
 }
 } // namespace
 
-static void emitVGSaveRestore(MachineBasicBlock::iterator II,
-                              const AArch64FrameLowering *TFI) {
-  MachineInstr &MI = *II;
-  MachineBasicBlock *MBB = MI.getParent();
-  MachineFunction *MF = MBB->getParent();
-
-  if (MI.getOpcode() != AArch64::VGSavePseudo &&
-      MI.getOpcode() != AArch64::VGRestorePseudo)
-    return;
-
-  auto *AFI = MF->getInfo<AArch64FunctionInfo>();
-  SMEAttrs FuncAttrs = AFI->getSMEFnAttrs();
-  bool LocallyStreaming =
-      FuncAttrs.hasStreamingBody() && !FuncAttrs.hasStreamingInterface();
-
-  int64_t VGFrameIdx =
-      LocallyStreaming ? AFI->getStreamingVGIdx() : AFI->getVGIdx();
-  assert(VGFrameIdx != std::numeric_limits<int>::max() &&
-         "Expected FrameIdx for VG");
-
-  CFIInstBuilder CFIBuilder(*MBB, II, MachineInstr::NoFlags);
-  if (MI.getOpcode() == AArch64::VGSavePseudo) {
-    const MachineFrameInfo &MFI = MF->getFrameInfo();
-    int64_t Offset =
-        MFI.getObjectOffset(VGFrameIdx) - TFI->getOffsetOfLocalArea();
-    CFIBuilder.buildOffset(AArch64::VG, Offset);
-  } else {
-    CFIBuilder.buildRestore(AArch64::VG);
-  }
-
-  MI.eraseFromParent();
-}
-
 void AArch64FrameLowering::processFunctionBeforeFrameIndicesReplaced(
     MachineFunction &MF, RegScavenger *RS = nullptr) const {
   for (auto &BB : MF)
     for (MachineBasicBlock::iterator II = BB.begin(); II != BB.end();) {
-      if (requiresSaveVG(MF))
-        emitVGSaveRestore(II++, this);
-      else if (StackTaggingMergeSetTag)
+      if (StackTaggingMergeSetTag)
         II = tryMergeAdjacentSTG(II, this, RS);
     }
 
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 018c16d61b12d..bf85e887df907 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -9441,12 +9441,6 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
 
   SDValue InGlue;
   if (RequiresSMChange) {
-    if (!Subtarget->isTargetDarwin() || Subtarget->hasSVE()) {
-      Chain = DAG.getNode(AArch64ISD::VG_SAVE, DL,
-                          DAG.getVTList(MVT::Other, MVT::Glue), Chain);
-      InGlue = Chain.getValue(1);
-    }
-
     SDValue NewChain = changeStreamingMode(
         DAG, DL, CallAttrs.callee().hasStreamingInterface(), Chain, InGlue,
         getSMToggleCondition(CallAttrs), PStateSM);
@@ -9637,13 +9631,6 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
     Result = changeStreamingMode(
         DAG, DL, !CallAttrs.callee().hasStreamingInterface(), Result, InGlue,
         getSMToggleCondition(CallAttrs), PStateSM);
-
-    if (!Subtarget->isTargetDarwin() || Subtarget->hasSVE()) {
-      InGlue = Result.getValue(1);
-      Result =
-          DAG.getNode(AArch64ISD::VG_RESTORE, DL,
-                      DAG.getVTList(MVT::Other, MVT::Glue), {Result, InGlue});
-    }
   }
 
   if (CallAttrs.requiresEnablingZAAfterCall())
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 98ebd512b0b75..d602ccf145b3b 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -5888,6 +5888,18 @@ static void appendReadRegExpr(SmallVectorImpl<char> &Expr, unsigned RegNum) {
   Expr.push_back(0);
 }
 
+// Convenience function to create a DWARF expression for loading a register from
+// a CFA offset.
+static void appendLoadRegExpr(SmallVectorImpl<char> &Expr,
+                              int64_t OffsetFromDefCFA) {
+  // This assumes the top of the DWARF stack contains the CFA.
+  Expr.push_back(dwarf::DW_OP_dup);
+  // Add the offset to the register.
+  appendConstantExpr(Expr, OffsetFromDefCFA, dwarf::DW_OP_plus);
+  // Dereference the address (loads a 64 bit value)..
+  Expr.push_back(dwarf::DW_OP_deref);
+}
+
 // Convenience function to create a comment for
 //  (+/-) NumBytes (* RegScale)?
 static void appendOffsetComment(int NumBytes, llvm::raw_string_ostream &Comment,
@@ -5956,9 +5968,10 @@ MCCFIInstruction llvm::createDefCFA(const TargetRegisterInfo &TRI,
   return MCCFIInstruction::cfiDefCfa(nullptr, DwarfReg, (int)Offset.getFixed());
 }
 
-MCCFIInstruction llvm::createCFAOffset(const TargetRegisterInfo &TRI,
-                                       unsigned Reg,
-                                       const StackOffset &OffsetFromDefCFA) {
+MCCFIInstruction
+llvm::createCFAOffset(const TargetRegisterInfo &TRI, unsigned Reg,
+                      const StackOffset &OffsetFromDefCFA,
+                      std::optional<int64_t> IncomingVGOffsetFromDefCFA) {
   int64_t NumBytes, NumVGScaledBytes;
   AArch64InstrInfo::decomposeStackOffsetForDwarfOffsets(
       OffsetFromDefCFA, NumBytes, NumVGScaledBytes);
@@ -5977,9 +5990,15 @@ MCCFIInstruction llvm::createCFAOffset(const TargetRegisterInfo &TRI,
   assert(NumVGScaledBytes && "Expected scalable offset");
   SmallString<64> OffsetExpr;
   // + VG * NumVGScaledBytes
-  appendOffsetComment(NumVGScaledBytes, Comment, "* VG");
-  appendReadRegExpr(OffsetExpr, TRI.getDwarfRegNum(AArch64::VG, true));
+  StringRef VGRegScale("* VG");
+  if (IncomingVGOffsetFromDefCFA) {
+    appendLoadRegExpr(OffsetExpr, *IncomingVGOffsetFromDefCFA);
+    VGRegScale = "* IncomingVG";
+  } else {
+    appendReadRegExpr(OffsetExpr, TRI.getDwarfRegNum(AArch64::VG, true));
+  }
   appendConstantExpr(OffsetExpr, NumVGScaledBytes, dwarf::DW_OP_mul);
+  appendOffsetComment(NumVGScaledBytes, Comment, VGRegScale);
   OffsetExpr.push_back(dwarf::DW_OP_plus);
   if (NumBytes) {
     // + NumBytes
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
index 7c255da333e4b..6abd18fd2e52f 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
@@ -642,8 +642,10 @@ bool isNZCVTouchedInInstructionRange(const MachineInstr &DefMI,
 MCCFIInstruction createDefCFA(const TargetRegisterInfo &TRI, unsigned FrameReg,
                               unsigned Reg, const StackOffset &Offset,
                               bool LastAdjustmentWasScalable = true);
-MCCFIInstruction createCFAOffset(const TargetRegisterInfo &MRI, unsigned Reg,
-                                 const StackOffset &OffsetFromDefCFA);
+MCCFIInstruction
+createCFAOffset(const TargetRegisterInfo &MRI, unsigned Reg,
+                const StackOffset &OffsetFromDefCFA,
+                std::optional<int64_t> IncomingVGOffsetFromDefCFA);
 
 /// emitFrameOffset - Emit instructions as needed to set DestReg to SrcReg
 /// plus Offset.  This is intended to be used from within the prolog/epilog
diff --git a/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h b/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
index 800787cc0b4f5..0f04b740dbe22 100644
--- a/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
@@ -243,10 +243,6 @@ class AArch64FunctionInfo final : public MachineFunctionInfo {
   // The PTRUE is used for the LD/ST of ZReg pairs in save and restore.
   unsigned PredicateRegForFillSpill = 0;
 
-  // The stack slots where VG values are stored to.
-  int64_t VGIdx = std::numeric_limits<int>::max();
-  int64_t StreamingVGIdx = std::numeric_limits<int>::max();
-
   // Holds the SME function attributes (streaming mode, ZA/ZT0 state).
   SMEAttrs SMEFnAttrs;
 
@@ -274,12 +270,6 @@ class AArch64FunctionInfo final : public MachineFunctionInfo {
   Register getPStateSMReg() const { return PStateSMReg; };
   void setPStateSMReg(Register Reg) { PStateSMReg = Reg; };
 
-  int64_t getVGIdx() const { return VGIdx; };
-  void setVGIdx(unsigned Idx) { VGIdx = Idx; };
-
-  int64_t getStreamingVGIdx() const { return StreamingVGIdx; };
-  void setStreamingVGIdx(unsigned FrameIdx) { StreamingVGIdx = FrameIdx; };
-
   bool isSVECC() const { return IsSVECC; };
   void setIsSVECC(bool s) { IsSVECC = s; };
 
diff --git a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
index db27ca978980f..86bdc8f6e2966 100644
--- a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
@@ -39,12 +39,6 @@ def AArch64_save_zt : SDNode<"AArch64ISD::SAVE_ZT", SDTypeProfile<0, 2,
 def AArch64CoalescerBarrier
     : SDNode<"AArch64ISD::COALESCER_BARRIER", SDTypeProfile<1, 1, []>, [SDNPOptInGlue, SDNPOutGlue]>;
 
-def AArch64VGSave : SDNode<"AArch64ISD::VG_SAVE", SDTypeProfi...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/152283

