[libunwind] [llvm] [LFI][AArch64] Add AArch64 LFI rewriter (PR #184277)

Zachary Yedidia via cfe-commits cfe-commits at lists.llvm.org
Tue Mar 3 01:27:52 PST 2026


https://github.com/zyedidia updated https://github.com/llvm/llvm-project/pull/184277

From d26b504d9b8dc6233f9040a09e9f3346e4610434 Mon Sep 17 00:00:00 2001
From: Zachary Yedidia <zyedidia at gmail.com>
Date: Mon, 2 Mar 2026 19:27:15 -0500
Subject: [PATCH 1/2] [LFI][AArch64] Add AArch64 LFI rewriter

This is the third patch in the LFI series, adding the AArch64-specific
MCLFIRewriter implementation. The rewriter performs instruction-level
sandboxing at the MC layer to enforce Lightweight Fault Isolation
guarantees.

The rewriter handles the following categories of instructions:

* Memory accesses (loads/stores): rewritten to use sandboxed addressing
  via the sandbox base register (x27), using the register-offset-with-UXTW
  (RoW) form when possible and falling back to a two-instruction
  guard+access sequence otherwise.
* Stack pointer modifications: redirected through a scratch register
  (x26) and then sandboxed back into SP.
* Link register modifications: deferred guard emission until the next
  control flow instruction for PAC compatibility.
* Indirect branches, calls, and returns: target addresses are sandboxed.
* PAC authenticated branches/returns: expanded to their component
  operations (authenticate + guard + branch).
* System instructions: SVC, MRS/MSR TPIDR_EL0, and DC ZVA are rewritten
  to use LFI conventions.
* Pre/post-index addressing: decomposed into base access + separate
  offset update.
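
The core guard these rewrites rely on is ``add x28, x27, wN, uxtw``, which
zero-extends the low 32 bits of the untrusted pointer and rebases it onto the
sandbox base in x27. A minimal Python model of that arithmetic (the base value
is hypothetical, not from the patch):

```python
# Model of the LFI address guard `add x28, x27, wN, uxtw`: the untrusted
# pointer is truncated to its low 32 bits (the `uxtw` zero-extension) and
# added to the sandbox base, so every guarded access lands inside the
# 4 GiB sandbox region regardless of the original pointer value.

SANDBOX_BASE = 0x2000_0000_0000  # hypothetical x27 value, 4 GiB aligned

def guard(untrusted_addr: int) -> int:
    """x28 = x27 + zero_extend_32(wN)"""
    return SANDBOX_BASE + (untrusted_addr & 0xFFFF_FFFF)

# An in-bounds pointer is left pointing at the same sandbox offset...
assert guard(SANDBOX_BASE + 0x1234) == SANDBOX_BASE + 0x1234
# ...while a forged out-of-sandbox pointer is forced back into bounds.
evil = 0x7FFF_DEAD_0000
assert SANDBOX_BASE <= guard(evil) < SANDBOX_BASE + (1 << 32)
```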

Additional features:

* Guard elimination optimization that avoids redundant sandboxing when
  consecutive memory accesses use the same base register.
* TSFlags-based MemOpAddrMode annotations for classifying instruction
  addressing modes, used by the rewriter to determine how to sandbox each
  instruction.
* Configurable sandboxing modes: +no-lfi-loads (stores-only) and
  +no-lfi-loads,+no-lfi-stores (jumps-only).
* Documentation updates covering the rewriting rules, context register
  layout, and all LFI conventions.
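
The TSFlags encoding added in AArch64InstrFormats.td can be sketched as
follows (bit positions taken from the patch: MemOpOffsetIdx in bits 26-23,
MemOpBaseIdx in bits 22-19, MemOpAddrMode in bits 18-14; the helper names
here are illustrative, not LLVM APIs):

```python
# Sketch of the MemOpAddrMode TSFlags packing from the TableGen hunk.
# The rewriter reads these fields back out of TSFlags to locate the base
# and offset operands of each memory instruction.

MEMOP_ADDR_MODE_INDEXED = 1  # [Xn, #imm], per MemOpAddrModeIndexed

def pack_tsflags(addr_mode: int, base_idx: int, offset_idx: int) -> int:
    # 5-bit addressing mode, 4-bit base operand index, 4-bit offset index.
    assert addr_mode < (1 << 5) and base_idx < (1 << 4) and offset_idx < (1 << 4)
    return (offset_idx << 23) | (base_idx << 19) | (addr_mode << 14)

def unpack_tsflags(flags: int) -> tuple[int, int, int]:
    return ((flags >> 14) & 0x1F, (flags >> 19) & 0xF, (flags >> 23) & 0xF)

# BaseLoadStoreUI in the patch: Indexed mode, base at operand 1, offset at 2.
flags = pack_tsflags(MEMOP_ADDR_MODE_INDEXED, 1, 2)
assert unpack_tsflags(flags) == (MEMOP_ADDR_MODE_INDEXED, 1, 2)
```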
---
 libunwind/src/DwarfInstructions.hpp           |    2 +-
 libunwind/src/UnwindRegistersRestore.S        |    6 +
 llvm/docs/LFI.rst                             |  259 ++-
 llvm/include/llvm/MC/MCLFIRewriter.h          |    6 +-
 llvm/lib/MC/MCAsmStreamer.cpp                 |    3 +
 llvm/lib/MC/MCLFIRewriter.cpp                 |    4 +
 llvm/lib/MC/MCObjectStreamer.cpp              |    3 +
 llvm/lib/MC/MCStreamer.cpp                    |    2 +-
 llvm/lib/Target/AArch64/AArch64Features.td    |    8 +-
 .../lib/Target/AArch64/AArch64InstrFormats.td |  183 +-
 llvm/lib/Target/AArch64/AArch64InstrInfo.cpp  |   14 +-
 .../Target/AArch64/AArch64TargetMachine.cpp   |    5 +
 .../MCTargetDesc/AArch64MCLFIRewriter.cpp     | 2070 +++++++++++++++++
 .../MCTargetDesc/AArch64MCLFIRewriter.h       |  144 ++
 .../MCTargetDesc/AArch64MCTargetDesc.cpp      |   15 +
 .../AArch64/MCTargetDesc/CMakeLists.txt       |    1 +
 llvm/lib/Target/AArch64/SMEInstrFormats.td    |    5 +
 .../Target/AArch64/Utils/AArch64BaseInfo.h    |   69 +
 llvm/test/MC/AArch64/LFI/branch.s             |   20 +
 llvm/test/MC/AArch64/LFI/exclusive.s          |  140 ++
 llvm/test/MC/AArch64/LFI/fp.s                 |  204 ++
 llvm/test/MC/AArch64/LFI/guard-elim.s         |  149 ++
 llvm/test/MC/AArch64/LFI/jumps-only.s         |   41 +
 llvm/test/MC/AArch64/LFI/literal.s            |   32 +
 llvm/test/MC/AArch64/LFI/lse.s                |  166 ++
 llvm/test/MC/AArch64/LFI/mem.s                |  437 ++++
 llvm/test/MC/AArch64/LFI/no-lfi-loads.s       |   33 +
 llvm/test/MC/AArch64/LFI/other.s              |    6 +
 llvm/test/MC/AArch64/LFI/pac.s                |   55 +
 llvm/test/MC/AArch64/LFI/prefetch.s           |   81 +
 llvm/test/MC/AArch64/LFI/rcpc.s               |   19 +
 llvm/test/MC/AArch64/LFI/reserved.s           |   45 +
 llvm/test/MC/AArch64/LFI/return.s             |   72 +
 llvm/test/MC/AArch64/LFI/simd.s               |  472 ++++
 llvm/test/MC/AArch64/LFI/stack.s              |   37 +
 llvm/test/MC/AArch64/LFI/sys.s                |   15 +
 llvm/test/MC/AArch64/LFI/tls-reg.s            |   13 +
 .../test/MC/AArch64/LFI/unsupported/literal.s |   26 +
 llvm/test/MC/AArch64/LFI/unsupported/pac.s    |   13 +
 llvm/unittests/Target/AArch64/CMakeLists.txt  |    1 +
 .../Target/AArch64/MemOpAddrModeTest.cpp      |  158 ++
 41 files changed, 4953 insertions(+), 81 deletions(-)
 create mode 100644 llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.cpp
 create mode 100644 llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.h
 create mode 100644 llvm/test/MC/AArch64/LFI/branch.s
 create mode 100644 llvm/test/MC/AArch64/LFI/exclusive.s
 create mode 100644 llvm/test/MC/AArch64/LFI/fp.s
 create mode 100644 llvm/test/MC/AArch64/LFI/guard-elim.s
 create mode 100644 llvm/test/MC/AArch64/LFI/jumps-only.s
 create mode 100644 llvm/test/MC/AArch64/LFI/literal.s
 create mode 100644 llvm/test/MC/AArch64/LFI/lse.s
 create mode 100644 llvm/test/MC/AArch64/LFI/mem.s
 create mode 100644 llvm/test/MC/AArch64/LFI/no-lfi-loads.s
 create mode 100644 llvm/test/MC/AArch64/LFI/other.s
 create mode 100644 llvm/test/MC/AArch64/LFI/pac.s
 create mode 100644 llvm/test/MC/AArch64/LFI/prefetch.s
 create mode 100644 llvm/test/MC/AArch64/LFI/rcpc.s
 create mode 100644 llvm/test/MC/AArch64/LFI/reserved.s
 create mode 100644 llvm/test/MC/AArch64/LFI/return.s
 create mode 100644 llvm/test/MC/AArch64/LFI/simd.s
 create mode 100644 llvm/test/MC/AArch64/LFI/stack.s
 create mode 100644 llvm/test/MC/AArch64/LFI/sys.s
 create mode 100644 llvm/test/MC/AArch64/LFI/tls-reg.s
 create mode 100644 llvm/test/MC/AArch64/LFI/unsupported/literal.s
 create mode 100644 llvm/test/MC/AArch64/LFI/unsupported/pac.s
 create mode 100644 llvm/unittests/Target/AArch64/MemOpAddrModeTest.cpp

diff --git a/libunwind/src/DwarfInstructions.hpp b/libunwind/src/DwarfInstructions.hpp
index 165c4a99e9a92..32bde2e04ce03 100644
--- a/libunwind/src/DwarfInstructions.hpp
+++ b/libunwind/src/DwarfInstructions.hpp
@@ -226,7 +226,7 @@ int DwarfInstructions<A, R>::stepWithDwarf(
       // __unw_step_stage2 is not used for cross unwinding, so we use
       // __aarch64__ rather than LIBUNWIND_TARGET_AARCH64 to make sure we are
       // building for AArch64 natively.
-#if defined(__aarch64__)
+#if defined(__aarch64__) && !defined(__LFI__)
       if (stage2 && cieInfo.mteTaggedFrame) {
         pint_t sp = registers.getSP();
         pint_t p = sp;
diff --git a/libunwind/src/UnwindRegistersRestore.S b/libunwind/src/UnwindRegistersRestore.S
index 76a80344034f7..a700ed7ce9f47 100644
--- a/libunwind/src/UnwindRegistersRestore.S
+++ b/libunwind/src/UnwindRegistersRestore.S
@@ -678,9 +678,15 @@ DEFINE_LIBUNWIND_FUNCTION(__libunwind_Registers_arm64_jumpto)
   ldp    x18,x19, [x0, #0x090]
   ldp    x20,x21, [x0, #0x0A0]
   ldp    x22,x23, [x0, #0x0B0]
+#ifndef __LFI__
   ldp    x24,x25, [x0, #0x0C0]
   ldp    x26,x27, [x0, #0x0D0]
   ldp    x28,x29, [x0, #0x0E0]
+#else
+  ldp    x24,xzr, [x0, #0x0C0]
+  ldp    x26,xzr, [x0, #0x0D0]
+  ldp    xzr,x29, [x0, #0x0E0]
+#endif
 
 #if defined(__ARM_FP) && __ARM_FP != 0
   ldp    d0, d1,  [x0, #0x110]
diff --git a/llvm/docs/LFI.rst b/llvm/docs/LFI.rst
index 65d8b70f17e0b..58542266388fe 100644
--- a/llvm/docs/LFI.rst
+++ b/llvm/docs/LFI.rst
@@ -63,15 +63,15 @@ to be applied to hand-written assembly, including inline assembly.
 Compiler Options
 ================
 
-The LFI target has several configuration options.
+The LFI target has several configuration options, specified via ``-mattr=``:
 
-* ``+lfi-loads``: enable sandboxing for loads (default: true).
-* ``+lfi-stores``: enable sandboxing for stores (default: true).
+* ``+no-lfi-loads``: Disable sandboxing for load instructions (stores-only mode).
+* ``+no-lfi-stores``: Disable sandboxing for store instructions.
 
-Use ``+nolfi-loads`` to create a "stores-only" sandbox that may read, but not
+Use ``+no-lfi-loads`` to create a "stores-only" sandbox that may read, but not
 write, outside the sandbox region.
 
-Use ``+nolfi-loads+nolfi-stores`` to create a "jumps-only" sandbox that may
+Use ``+no-lfi-loads,+no-lfi-stores`` to create a "jumps-only" sandbox that may
 read/write outside the sandbox region but may not transfer control outside
 (e.g., may not execute system calls directly). This is primarily useful in
 combination with some other form of memory sandboxing, such as Intel MPK.
@@ -88,7 +88,23 @@ that must be maintained.
 * ``sp``: always holds an address within the sandbox.
 * ``x30``: always holds an address within the sandbox.
 * ``x26``: scratch register.
-* ``x25``: points to a thread-local virtual register file for storing runtime context information.
+* ``x25``: context register (see below).
+
+Context Register
+~~~~~~~~~~~~~~~~
+
+The context register (``x25``) points to a block of thread-local memory managed
+by the LFI runtime. The layout is as follows:
+
++--------+--------+----------------------------------------------+
+| Offset | Size   | Description                                  |
++--------+--------+----------------------------------------------+
+| 0      | 8      | Reserved for use by the LFI runtime.         |
++--------+--------+----------------------------------------------+
+| 8      | 24     | Reserved for future use.                     |
++--------+--------+----------------------------------------------+
+| 32     | 8      | Virtual thread pointer (used for TLS access).|
++--------+--------+----------------------------------------------+
 
 Linker Support
 ==============
@@ -240,73 +256,178 @@ before moving it back into ``sp`` with a safe ``add``.
 Link register modification
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-When the link register is modified, we write the modified value to a
-temporary, before loading it back into ``x30`` with a safe ``add``.
-
-+-----------------------+----------------------------+
-|       Original        |         Rewritten          |
-+-----------------------+----------------------------+
-| .. code-block::       | .. code-block::            |
-|                       |                            |
-|    ldr x30, [...]     |    ldr x26, [...]          |
-|                       |    add x30, x27, w26, uxtw |
-|                       |                            |
-+-----------------------+----------------------------+
-| .. code-block::       | .. code-block::            |
-|                       |                            |
-|    ldp xN, x30, [...] |    ldp xN, x26, [...]      |
-|                       |    add x30, x27, w26, uxtw |
-|                       |                            |
-+-----------------------+----------------------------+
-| .. code-block::       | .. code-block::            |
-|                       |                            |
-|    ldp x30, xN, [...] |    ldp x26, xN, [...]      |
-|                       |    add x30, x27, w26, uxtw |
-|                       |                            |
-+-----------------------+----------------------------+
+When the link register is modified, the guard is deferred until the next
+control flow instruction. This approach maintains compatibility with Pointer
+Authentication Code (PAC) instructions by keeping signed pointers intact until
+they are needed for control flow. The guard uses ``x30`` as both the source and
+destination (``add x30, x27, w30, uxtw``).
+
++---------------------------+-------------------------------+
+|         Original          |           Rewritten           |
++---------------------------+-------------------------------+
+| .. code-block::           | .. code-block::               |
+|                           |                               |
+|    ldr x30, [...]         |    ldr x30, [...]             |
+|    ret                    |    add x30, x27, w30, uxtw    |
+|                           |    ret                        |
+|                           |                               |
++---------------------------+-------------------------------+
+| .. code-block::           | .. code-block::               |
+|                           |                               |
+|    ldp xN, x30, [...]     |    ldp xN, x30, [...]         |
+|    ret                    |    add x30, x27, w30, uxtw    |
+|                           |    ret                        |
+|                           |                               |
++---------------------------+-------------------------------+
+
+Pointer Authentication Code (PAC) support
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+LFI is designed to be compatible with ARM Pointer Authentication Code (PAC)
+instructions. PAC signs and authenticates pointers (typically the return
+address in ``x30``) to protect against control-flow hijacking attacks.
+
+To get the security benefits of PAC with LFI-compiled code, the hardware must
+support **FEAT_FPAC** (Faulting PAC), which causes authentication failures to
+immediately fault. Without FEAT_FPAC, a failed authentication produces a
+"poisoned" pointer that only faults when dereferenced, which may not provide
+immediate detection of authentication failures.
+
+**PACIASP** (sign return address) passes through unchanged. It signs the
+current value of ``x30`` using the stack pointer as a modifier, which does not
+affect LFI's security guarantees.
+
+**AUTIASP** (authenticate return address) passes through unchanged. On
+processors with FEAT_FPAC, authentication failure automatically faults.
+
++-------------------+------------------------+
+|     Original      |       Rewritten        |
++-------------------+------------------------+
+| .. code-block::   | .. code-block::        |
+|                   |                        |
+|    paciasp        |    paciasp             |
+|                   |                        |
++-------------------+------------------------+
+| .. code-block::   | .. code-block::        |
+|                   |                        |
+|    autiasp        |    autiasp             |
+|                   |                        |
++-------------------+------------------------+
+
+Note that the deferred LR guard approach is essential for PAC compatibility.
+If the guard were applied immediately after loading a signed return address,
+it would corrupt the PAC signature, causing subsequent ``autiasp`` to fail.
+By deferring the guard until control flow, signed pointers remain intact
+through the authentication process.
+
+**Authenticated returns** (``retaa``/``retab``) combine authentication with
+return. LFI expands these into their component operations:
+
++-------------------+-------------------------------+
+|     Original      |           Rewritten           |
++-------------------+-------------------------------+
+| .. code-block::   | .. code-block::               |
+|                   |                               |
+|    retaa          |    autiasp                    |
+|                   |    add x30, x27, w30, uxtw    |
+|                   |    ret                        |
+|                   |                               |
++-------------------+-------------------------------+
+| .. code-block::   | .. code-block::               |
+|                   |                               |
+|    retab          |    autibsp                    |
+|                   |    add x30, x27, w30, uxtw    |
+|                   |    ret                        |
+|                   |                               |
++-------------------+-------------------------------+
+
+**Authenticated branches** (``braa``/``brab``/``braaz``/``brabz``) combine
+authentication with indirect branch. LFI expands these by first authenticating
+the target register, then performing a normal sandboxed branch:
+
++-------------------+-------------------------------+
+|     Original      |           Rewritten           |
++-------------------+-------------------------------+
+| .. code-block::   | .. code-block::               |
+|                   |                               |
+|    braa xN, xM    |    autia xN, xM               |
+|                   |    add x28, x27, wN, uxtw     |
+|                   |    br x28                     |
+|                   |                               |
++-------------------+-------------------------------+
+| .. code-block::   | .. code-block::               |
+|                   |                               |
+|    braaz xN       |    autiza xN                  |
+|                   |    add x28, x27, wN, uxtw     |
+|                   |    br x28                     |
+|                   |                               |
++-------------------+-------------------------------+
+
+**Authenticated calls** (``blraa``/``blrab``/``blraaz``/``blrabz``) are
+expanded similarly:
+
++-------------------+-------------------------------+
+|     Original      |           Rewritten           |
++-------------------+-------------------------------+
+| .. code-block::   | .. code-block::               |
+|                   |                               |
+|    blraa xN, xM   |    autia xN, xM               |
+|                   |    add x28, x27, wN, uxtw     |
+|                   |    blr x28                    |
+|                   |                               |
++-------------------+-------------------------------+
+| .. code-block::   | .. code-block::               |
+|                   |                               |
+|    blraaz xN      |    autiza xN                  |
+|                   |    add x28, x27, wN, uxtw     |
+|                   |    blr x28                    |
+|                   |                               |
++-------------------+-------------------------------+
+
+**Authenticated exception returns** (``eretaa``/``eretab``) are not supported
+by LFI and will produce an error.
 
 System instructions
 ~~~~~~~~~~~~~~~~~~~
 
 System calls are rewritten into a sequence that loads the address of the first
 runtime call entrypoint and jumps to it. The runtime call entrypoint table is
-stored at the start of the sandbox, so it can be referenced by ``x27``. The
-rewrite also saves and restores the link register, since it is used for
-branching into the runtime.
-
-+-----------------+----------------------------+
-|    Original     |         Rewritten          |
-+-----------------+----------------------------+
-| .. code-block:: | .. code-block::            |
-|                 |                            |
-|    svc #0       |    mov w26, w30            |
-|                 |    ldr x30, [x27]          |
-|                 |    blr x30                 |
-|                 |    add x30, x27, w26, uxtw |
-|                 |                            |
-+-----------------+----------------------------+
+stored at a negative offset from the sandbox base, so it can be referenced by
+``x27``. The rewrite also saves and restores the link register, since it is
+used for branching into the runtime.
+
++-----------------+------------------------------+
+|    Original     |          Rewritten           |
++-----------------+------------------------------+
+| .. code-block:: | .. code-block::              |
+|                 |                              |
+|    svc #0       |    mov x26, x30              |
+|                 |    ldur x30, [x27, #-8]      |
+|                 |    blr x30                   |
+|                 |    add x30, x27, w26, uxtw   |
+|                 |                              |
++-----------------+------------------------------+
 
 Thread-local storage
 ~~~~~~~~~~~~~~~~~~~~
 
-TLS accesses are rewritten into accesses offset from ``x25``, which is a
-reserved register that points to a virtual register file, with a location for
-storing the sandbox's thread pointer. ``TP`` is the offset into that virtual
-register file where the thread pointer is stored.
-
-+----------------------+-----------------------+
-|       Original       |       Rewritten       |
-+----------------------+-----------------------+
-| .. code-block::      | .. code-block::       |
-|                      |                       |
-|    mrs xN, tpidr_el0 |    ldr xN, [x25, #TP] |
-|                      |                       |
-+----------------------+-----------------------+
-| .. code-block::      | .. code-block::       |
-|                      |                       |
-|    mrs tpidr_el0, xN |    str xN, [x25, #TP] |
-|                      |                       |
-+----------------------+-----------------------+
+TLS accesses are rewritten into loads/stores from the context register
+(``x25``), which holds the virtual thread pointer at offset 32 (see
+`Context Register`_).
+
++----------------------+-------------------------+
+|       Original       |        Rewritten        |
++----------------------+-------------------------+
+| .. code-block::      | .. code-block::         |
+|                      |                         |
+|    mrs xN, tpidr_el0 |    ldr xN, [x25, #32]   |
+|                      |                         |
++----------------------+-------------------------+
+| .. code-block::      | .. code-block::         |
+|                      |                         |
+|    msr tpidr_el0, xN |    str xN, [x25, #32]   |
+|                      |                         |
++----------------------+-------------------------+
 
 Optimizations
 =============
@@ -335,22 +456,14 @@ can be removed.
 Address generation
 ~~~~~~~~~~~~~~~~~~
 
+**Note**: this optimization has not been implemented.
+
 Addresses to global symbols in position-independent executables are frequently
 generated via ``adrp`` followed by ``ldr``. Since the address generated by
 ``adrp`` can be statically guaranteed to be within the sandbox, it is safe to
 directly target ``x28`` for these sequences. This allows the omission of a
 guard instruction before the ``ldr``.
 
-+----------------------+-----------------------+
-|       Original       |       Rewritten       |
-+----------------------+-----------------------+
-| .. code-block::      | .. code-block::       |
-|                      |                       |
-|    adrp xN, target   |    adrp x28, target   |
-|    ldr xN, [xN, imm] |    ldr xN, [x28, imm] |
-|                      |                       |
-+----------------------+-----------------------+
-
 Stack guard elimination
 ~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/llvm/include/llvm/MC/MCLFIRewriter.h b/llvm/include/llvm/MC/MCLFIRewriter.h
index 90f8a9b0e0c09..95972202a25c6 100644
--- a/llvm/include/llvm/MC/MCLFIRewriter.h
+++ b/llvm/include/llvm/MC/MCLFIRewriter.h
@@ -41,6 +41,7 @@ class MCLFIRewriter {
       : Ctx(Ctx), InstInfo(std::move(II)), RegInfo(std::move(RI)) {}
 
   LLVM_ABI void error(const MCInst &Inst, const char Msg[]);
+  LLVM_ABI void warning(const MCInst &Inst, const char Msg[]);
 
   void disable() { Enabled = false; }
   void enable() { Enabled = true; }
@@ -61,7 +62,10 @@ class MCLFIRewriter {
 
   // Called when a label is emitted. Used for optimizations that require
   // information about jump targets, such as guard elimination.
-  virtual void onLabel(const MCSymbol *Symbol) {}
+  virtual void onLabel(const MCSymbol *Symbol, MCStreamer &Out) {}
+
+  // Called at the end of the stream to flush any pending state.
+  virtual void finish(MCStreamer &Out) {}
 };
 
 } // namespace llvm
diff --git a/llvm/lib/MC/MCAsmStreamer.cpp b/llvm/lib/MC/MCAsmStreamer.cpp
index 1a50ae43cd9c9..cea014effa121 100644
--- a/llvm/lib/MC/MCAsmStreamer.cpp
+++ b/llvm/lib/MC/MCAsmStreamer.cpp
@@ -2581,6 +2581,9 @@ void MCAsmStreamer::emitRawTextImpl(StringRef String) {
 }
 
 void MCAsmStreamer::finishImpl() {
+  if (LFIRewriter)
+    LFIRewriter->finish(*this);
+
   // If we are generating dwarf for assembly source files dump out the sections.
   if (getContext().getGenDwarfForAssembly())
     MCGenDwarfInfo::Emit(this);
diff --git a/llvm/lib/MC/MCLFIRewriter.cpp b/llvm/lib/MC/MCLFIRewriter.cpp
index 0ffbc02689aa2..61e64988cd041 100644
--- a/llvm/lib/MC/MCLFIRewriter.cpp
+++ b/llvm/lib/MC/MCLFIRewriter.cpp
@@ -23,6 +23,10 @@ void MCLFIRewriter::error(const MCInst &Inst, const char Msg[]) {
   Ctx.reportError(Inst.getLoc(), Msg);
 }
 
+void MCLFIRewriter::warning(const MCInst &Inst, const char Msg[]) {
+  Ctx.reportWarning(Inst.getLoc(), Msg);
+}
+
 bool MCLFIRewriter::isCall(const MCInst &Inst) const {
   return InstInfo->get(Inst.getOpcode()).isCall();
 }
diff --git a/llvm/lib/MC/MCObjectStreamer.cpp b/llvm/lib/MC/MCObjectStreamer.cpp
index 58aa7945d7393..48eb6b6186dec 100644
--- a/llvm/lib/MC/MCObjectStreamer.cpp
+++ b/llvm/lib/MC/MCObjectStreamer.cpp
@@ -791,6 +791,9 @@ void MCObjectStreamer::emitAddrsigSym(const MCSymbol *Sym) {
 }
 
 void MCObjectStreamer::finishImpl() {
+  if (LFIRewriter)
+    LFIRewriter->finish(*this);
+
   getContext().RemapDebugPaths();
 
   // If we are generating dwarf for assembly source files dump out the sections.
diff --git a/llvm/lib/MC/MCStreamer.cpp b/llvm/lib/MC/MCStreamer.cpp
index 33c9a05bec114..685e82d6a3633 100644
--- a/llvm/lib/MC/MCStreamer.cpp
+++ b/llvm/lib/MC/MCStreamer.cpp
@@ -400,7 +400,7 @@ void MCStreamer::emitLabel(MCSymbol *Symbol, SMLoc Loc) {
   Symbol->setFragment(&getCurrentSectionOnly()->getDummyFragment());
 
   if (LFIRewriter)
-    LFIRewriter->onLabel(Symbol);
+    LFIRewriter->onLabel(Symbol, *this);
 
   MCTargetStreamer *TS = getTargetStreamer();
   if (TS)
diff --git a/llvm/lib/Target/AArch64/AArch64Features.td b/llvm/lib/Target/AArch64/AArch64Features.td
index faee640a910d0..c49658510bfbb 100644
--- a/llvm/lib/Target/AArch64/AArch64Features.td
+++ b/llvm/lib/Target/AArch64/AArch64Features.td
@@ -1060,7 +1060,13 @@ def FeatureHardenSlsNoComdat : SubtargetFeature<"harden-sls-nocomdat",
   "HardenSlsNoComdat", "true",
   "Generate thunk code for SLS mitigation in the normal text section">;
 
-
+// LFI (Lightweight Fault Isolation) features.
+// By default, both loads and stores are sandboxed. Use +no-lfi-loads for
+// stores-only mode, or +no-lfi-loads,+no-lfi-stores for jumps-only mode.
+def FeatureNoLFILoads : SubtargetFeature<"no-lfi-loads", "NoLFILoads", "true",
+  "Disable LFI sandboxing for load instructions (stores-only mode)">;
+def FeatureNoLFIStores : SubtargetFeature<"no-lfi-stores", "NoLFIStores", "true",
+  "Disable LFI sandboxing for store instructions">;
 // Only intended to be used by disassemblers.
 def FeatureAll
     : SubtargetFeature<"all", "IsAll", "true", "Enable all instructions">;
diff --git a/llvm/lib/Target/AArch64/AArch64InstrFormats.td b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
index 7d4e034ca16c8..8c63ad42da29a 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrFormats.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
@@ -75,6 +75,24 @@ def SMEMatrixTileD : SMEMatrixTypeEnum<4>;
 def SMEMatrixTileQ : SMEMatrixTypeEnum<5>;
 def SMEMatrixArray : SMEMatrixTypeEnum<6>;
 
+// Memory operation addressing mode classification for load/store instructions.
+// Used to identify operand layout for memory operations.
+class MemOpAddrModeEnum<bits<5> val> {
+  bits<5> Value = val;
+}
+def MemOpAddrModeNone       : MemOpAddrModeEnum<0>;  // Not a memory operation
+def MemOpAddrModeIndexed    : MemOpAddrModeEnum<1>;  // [Xn, #imm]
+def MemOpAddrModeUnscaled   : MemOpAddrModeEnum<2>;  // [Xn, #simm] (unscaled)
+def MemOpAddrModePreIdx     : MemOpAddrModeEnum<3>;  // [Xn, #imm]!
+def MemOpAddrModePostIdx    : MemOpAddrModeEnum<4>;  // [Xn], #imm
+def MemOpAddrModeRegOff     : MemOpAddrModeEnum<5>;  // [Xn, Xm, extend]
+def MemOpAddrModeLiteral    : MemOpAddrModeEnum<6>;  // PC-relative literal
+def MemOpAddrModeNoIdx      : MemOpAddrModeEnum<7>;  // [Xn] (no offset)
+def MemOpAddrModePair       : MemOpAddrModeEnum<8>;  // LDP/STP [Xn, #imm]
+def MemOpAddrModePairPre    : MemOpAddrModeEnum<9>;  // LDP/STP [Xn, #imm]!
+def MemOpAddrModePairPost   : MemOpAddrModeEnum<10>; // LDP/STP [Xn], #imm
+def MemOpAddrModePostIdxReg : MemOpAddrModeEnum<11>; // [Xn], Xm (SIMD post-index)
+
 // AArch64 Instruction Format
 class AArch64Inst<Format f, string cstr> : Instruction {
   field bits<32> Inst; // Instruction encoding.
@@ -97,7 +115,19 @@ class AArch64Inst<Format f, string cstr> : Instruction {
   DestructiveInstTypeEnum DestructiveInstType = NotDestructive;
   SMEMatrixTypeEnum SMEMatrixType = SMEMatrixNone;
   ElementSizeEnum ElementSize = ElementSizeNone;
-
+  MemOpAddrModeEnum MemOpAddrMode = MemOpAddrModeNone;
+  // Index of the base register operand for memory instructions.
+  // 0 means not set (non-memory instruction). This means operand index 0
+  // cannot be used as a base register, which is fine for AArch64 since
+  // operand 0 is always the data/destination register.
+  bits<4> MemOpBaseIdx = 0;
+  // Index of the offset operand for memory instructions.
+  // 0 means not set (no offset operand).
+  bits<4> MemOpOffsetIdx = 0;
+
+  let TSFlags{26-23} = MemOpOffsetIdx;
+  let TSFlags{22-19} = MemOpBaseIdx;
+  let TSFlags{18-14} = MemOpAddrMode.Value;
   let TSFlags{13-11} = SMEMatrixType.Value;
   let TSFlags{10}    = isPTestLike;
   let TSFlags{9}     = isWhile;
@@ -2172,7 +2202,7 @@ class SpecialReturn<bits<4> opc, string asm>
   let Inst{9-5} = 0b11111;
 }
 
-let mayLoad = 1 in
+let mayLoad = 1, MemOpAddrMode = MemOpAddrModeNoIdx in
 class RCPCLoad<bits<2> sz, string asm, RegisterClass RC>
   : I<(outs RC:$Rt), (ins GPR64sp0:$Rn), asm, "\t$Rt, [$Rn]", "", []>,
   Sched<[WriteLD]> {
@@ -2182,6 +2212,7 @@ class RCPCLoad<bits<2> sz, string asm, RegisterClass RC>
   let Inst{29-10} = 0b11100010111111110000;
   let Inst{9-5} = Rn;
   let Inst{4-0} = Rt;
+  let MemOpBaseIdx = 1;  // Rt, [Rn]
 }
 
 class AuthBase<bits<1> M, dag oops, dag iops, string asm, string operands,
@@ -3815,6 +3846,9 @@ class BaseLoadStoreUI<bits<2> sz, bit V, bits<2> opc, dag oops, dag iops,
   let Inst{4-0}   = Rt;
 
   let DecoderMethod = "DecodeUnsignedLdStInstruction";
+  let MemOpAddrMode = MemOpAddrModeIndexed;
+  let MemOpBaseIdx = 1;  // Rt, [Rn, #imm]
+  let MemOpOffsetIdx = 2;
 }
 
 multiclass LoadUI<bits<2> sz, bit V, bits<2> opc, DAGOperand regtype,
@@ -3902,6 +3936,7 @@ class LoadLiteral<bits<2> opc, bit V, RegisterOperand regtype, string asm, list<
   let Inst{25-24} = 0b00;
   let Inst{23-5}  = label;
   let Inst{4-0}   = Rt;
+  let MemOpAddrMode = MemOpAddrModeLiteral;
 }
 
 let mayLoad = 0, mayStore = 0, hasSideEffects = 1 in
@@ -3917,6 +3952,7 @@ class PrefetchLiteral<bits<2> opc, bit V, string asm, list<dag> pat>
   let Inst{25-24} = 0b00;
   let Inst{23-5}  = label;
   let Inst{4-0}   = Rt;
+  let MemOpAddrMode = MemOpAddrModeLiteral;
 }
 
 //---
@@ -4055,6 +4091,9 @@ class LoadStore8RO<bits<2> sz, bit V, bits<2> opc, string asm, dag ins,
   let Inst{11-10} = 0b10;
   let Inst{9-5}   = Rn;
   let Inst{4-0}   = Rt;
+  let MemOpAddrMode = MemOpAddrModeRegOff;
+  let MemOpBaseIdx = 1;  // Rt, [Rn, Rm, extend]
+  let MemOpOffsetIdx = 2;
 }
 
 class ROInstAlias<string asm, DAGOperand regtype, Instruction INST,
@@ -4134,6 +4173,9 @@ class LoadStore16RO<bits<2> sz, bit V, bits<2> opc, string asm, dag ins,
   let Inst{11-10} = 0b10;
   let Inst{9-5}   = Rn;
   let Inst{4-0}   = Rt;
+  let MemOpAddrMode = MemOpAddrModeRegOff;
+  let MemOpBaseIdx = 1;  // Rt, [Rn, Rm, extend]
+  let MemOpOffsetIdx = 2;
 }
 
 multiclass Load16RO<bits<2> sz, bit V, bits<2> opc, DAGOperand regtype,
@@ -4206,6 +4248,9 @@ class LoadStore32RO<bits<2> sz, bit V, bits<2> opc, string asm, dag ins,
   let Inst{11-10} = 0b10;
   let Inst{9-5}   = Rn;
   let Inst{4-0}   = Rt;
+  let MemOpAddrMode = MemOpAddrModeRegOff;
+  let MemOpBaseIdx = 1;  // Rt, [Rn, Rm, extend]
+  let MemOpOffsetIdx = 2;
 }
 
 multiclass Load32RO<bits<2> sz, bit V, bits<2> opc, DAGOperand regtype,
@@ -4278,6 +4323,9 @@ class LoadStore64RO<bits<2> sz, bit V, bits<2> opc, string asm, dag ins,
   let Inst{11-10} = 0b10;
   let Inst{9-5}   = Rn;
   let Inst{4-0}   = Rt;
+  let MemOpAddrMode = MemOpAddrModeRegOff;
+  let MemOpBaseIdx = 1;  // Rt, [Rn, Rm, extend]
+  let MemOpOffsetIdx = 2;
 }
 
 multiclass Load64RO<bits<2> sz, bit V, bits<2> opc, DAGOperand regtype,
@@ -4350,6 +4398,9 @@ class LoadStore128RO<bits<2> sz, bit V, bits<2> opc, string asm, dag ins,
   let Inst{11-10} = 0b10;
   let Inst{9-5}   = Rn;
   let Inst{4-0}   = Rt;
+  let MemOpAddrMode = MemOpAddrModeRegOff;
+  let MemOpBaseIdx = 1;  // Rt, [Rn, Rm, extend]
+  let MemOpOffsetIdx = 2;
 }
 
 multiclass Load128RO<bits<2> sz, bit V, bits<2> opc, DAGOperand regtype,
@@ -4492,6 +4543,9 @@ class BaseLoadStoreUnscale<bits<2> sz, bit V, bits<2> opc, dag oops, dag iops,
   let Inst{4-0}   = Rt;
 
   let DecoderMethod = "DecodeSignedLdStInstruction";
+  let MemOpAddrMode = MemOpAddrModeUnscaled;
+  let MemOpBaseIdx = 1;  // Rt, [Rn, #simm]
+  let MemOpOffsetIdx = 2;
 }
 
 // Armv8.4 LDAPR & STLR with Immediate Offset instruction
@@ -4625,6 +4679,9 @@ class BaseLoadStorePreIdx<bits<2> sz, bit V, bits<2> opc, dag oops, dag iops,
   let Inst{4-0}   = Rt;
 
   let DecoderMethod = "DecodeSignedLdStInstruction";
+  let MemOpAddrMode = MemOpAddrModePreIdx;
+  let MemOpBaseIdx = 2;  // wback, Rt, [Rn, #imm]!
+  let MemOpOffsetIdx = 3;
 }
 
 let hasSideEffects = 0 in {
@@ -4671,6 +4728,9 @@ class BaseLoadStorePostIdx<bits<2> sz, bit V, bits<2> opc, dag oops, dag iops,
   let Inst{4-0}   = Rt;
 
   let DecoderMethod = "DecodeSignedLdStInstruction";
+  let MemOpAddrMode = MemOpAddrModePostIdx;
+  let MemOpBaseIdx = 2;  // wback, Rt, [Rn], #imm
+  let MemOpOffsetIdx = 3;
 }
 
 let hasSideEffects = 0 in {
@@ -4720,6 +4780,9 @@ class BaseLoadStorePairOffset<bits<2> opc, bit V, bit L, dag oops, dag iops,
   let Inst{4-0}   = Rt;
 
   let DecoderMethod = "DecodePairLdStInstruction";
+  let MemOpAddrMode = MemOpAddrModePair;
+  let MemOpBaseIdx = 2;  // Rt, Rt2, [Rn, #imm]
+  let MemOpOffsetIdx = 3;
 }
 
 multiclass LoadPairOffset<bits<2> opc, bit V, RegisterOperand regtype,
@@ -4811,6 +4874,9 @@ class BaseLoadStorePairPreIdx<bits<2> opc, bit V, bit L, dag oops, dag iops,
   let Inst{4-0}   = Rt;
 
   let DecoderMethod = "DecodePairLdStInstruction";
+  let MemOpAddrMode = MemOpAddrModePairPre;
+  let MemOpBaseIdx = 3;  // wback, Rt, Rt2, [Rn, #imm]!
+  let MemOpOffsetIdx = 4;
 }
 
 let hasSideEffects = 0 in {
@@ -4852,6 +4918,9 @@ class BaseLoadStorePairPostIdx<bits<2> opc, bit V, bit L, dag oops, dag iops,
   let Inst{4-0}   = Rt;
 
   let DecoderMethod = "DecodePairLdStInstruction";
+  let MemOpAddrMode = MemOpAddrModePairPost;
+  let MemOpBaseIdx = 3;  // wback, Rt, Rt2, [Rn], #imm
+  let MemOpOffsetIdx = 4;
 }
 
 let hasSideEffects = 0 in {
@@ -5004,6 +5073,8 @@ class BaseLoadStoreExclusive<bits<2> sz, bit o2, bit L, bit o1, bit o0,
   let Inst{15}    = o0;
 
   let DecoderMethod = "DecodeExclusiveLdStInstruction";
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
 }
 
 // Neither Rs nor Rt2 operands.
@@ -5020,6 +5091,7 @@ class LoadStoreExclusiveSimple<bits<2> sz, bit o2, bit L, bit o1, bit o0,
   let Inst{4-0} = Rt;
 
   let PostEncoderMethod = "fixLoadStoreExclusive<0,0>";
+  let MemOpBaseIdx = 1;  // Rt, [Rn]
 }
 
 // Simple load acquires don't set the exclusive monitor
@@ -5053,6 +5125,7 @@ class LoadExclusivePair<bits<2> sz, bit o2, bit L, bit o1, bit o0,
   let Inst{4-0} = Rt;
 
   let PostEncoderMethod = "fixLoadStoreExclusive<0,1>";
+  let MemOpBaseIdx = 2;  // Rt, Rt2, [Rn]
 }
 
 // Armv9.6-a load-store exclusive instructions
@@ -5065,6 +5138,8 @@ class BaseLoadStoreExclusiveLSUI<bits<2> sz, bit L, bit o0,
   let Inst{22}    = L;
   let Inst{21}    = 0b0;
   let Inst{15}    = o0;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
 }
 
 
@@ -5086,6 +5161,7 @@ class LoadExclusiveLSUI<bits<2> sz, bit L, bit o0,
   let Inst{4-0} = Rt;
 
   let PostEncoderMethod = "fixLoadStoreExclusive<0,0>";
+  let MemOpBaseIdx = 1;  // Rt, [Rn]
 }
 
  class StoreExclusiveLSUI<bits<2> sz, bit L, bit o0,
@@ -5106,6 +5182,7 @@ class LoadExclusiveLSUI<bits<2> sz, bit L, bit o0,
 
    let Constraints = "@earlyclobber $Ws";
    let PostEncoderMethod = "fixLoadStoreExclusive<1,0>";
+   let MemOpBaseIdx = 2;  // Ws, Rt, [Rn]
  }
 
 // Armv9.6-a load-store unprivileged instructions
@@ -5125,6 +5202,9 @@ class BaseLoadUnprivilegedLSUI<bits<2> sz, dag oops, dag iops, string asm>
    let Inst{9-5} = Rn;
    let Inst{4-0} = Rt;
    let PostEncoderMethod = "fixLoadStoreExclusive<0,0>";
+
+   let MemOpAddrMode = MemOpAddrModeNoIdx;
+   let MemOpBaseIdx = 1;  // Rt, [Rn]
 }
 
 multiclass LoadUnprivilegedLSUI<bits<2> sz, RegisterClass regtype, string asm> {
@@ -5151,6 +5231,9 @@ class BaseStoreUnprivilegedLSUI<bits<2> sz, dag oops, dag iops, string asm>
    let Inst{4-0} = Rt;
    let PostEncoderMethod = "fixLoadStoreExclusive<1,0>";
    let mayStore = 1;
+
+   let MemOpAddrMode = MemOpAddrModeNoIdx;
+   let MemOpBaseIdx = 2;  // Ws, Rt, [Rn]
 }
 
 multiclass StoreUnprivilegedLSUI<bits<2> sz, RegisterClass regtype, string asm> {
@@ -5185,6 +5268,7 @@ class StoreExclusive<bits<2> sz, bit o2, bit L, bit o1, bit o0,
 
   let Constraints = "@earlyclobber $Ws";
   let PostEncoderMethod = "fixLoadStoreExclusive<1,0>";
+  let MemOpBaseIdx = 2;  // Ws, Rt, [Rn]
 }
 
 class StoreExclusivePair<bits<2> sz, bit o2, bit L, bit o1, bit o0,
@@ -5204,6 +5288,7 @@ class StoreExclusivePair<bits<2> sz, bit o2, bit L, bit o1, bit o0,
   let Inst{4-0} = Rt;
 
   let Constraints = "@earlyclobber $Ws";
+  let MemOpBaseIdx = 3;  // Ws, Rt, Rt2, [Rn]
 }
 
 // Armv8.5-A Memory Tagging Extension
@@ -10909,6 +10994,9 @@ class BaseSIMDLdSt<bit Q, bit L, bits<4> opcode, bits<2> size,
   let Inst{11-10} = size;
   let Inst{9-5} = Rn;
   let Inst{4-0} = Vt;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 1;  // Vt, [Rn]
 }
 
 class BaseSIMDLdStPost<bit Q, bit L, bits<4> opcode, bits<2> size,
@@ -10927,6 +11015,10 @@ class BaseSIMDLdStPost<bit Q, bit L, bits<4> opcode, bits<2> size,
   let Inst{11-10} = size;
   let Inst{9-5} = Rn;
   let Inst{4-0} = Vt;
+
+  let MemOpAddrMode = MemOpAddrModePostIdxReg;
+  let MemOpBaseIdx = 2;  // wback, Vt, [Rn], Xm
+  let MemOpOffsetIdx = 3;
 }
 
 // The immediate form of AdvSIMD post-indexed addressing is encoded with
@@ -11241,6 +11333,9 @@ class BaseSIMDLdR<bit Q, bit R, bits<3> opcode, bit S, bits<2> size, string asm,
   let Inst{20-16} = 0b00000;
   let Inst{12} = S;
   let Inst{11-10} = size;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 1;  // Vt, [Rn]
 }
 let mayLoad = 1, mayStore = 0, hasSideEffects = 0 in
 class BaseSIMDLdRPost<bit Q, bit R, bits<3> opcode, bit S, bits<2> size,
@@ -11255,6 +11350,10 @@ class BaseSIMDLdRPost<bit Q, bit R, bits<3> opcode, bit S, bits<2> size,
   let Inst{20-16} = Xm;
   let Inst{12} = S;
   let Inst{11-10} = size;
+
+  let MemOpAddrMode = MemOpAddrModePostIdxReg;
+  let MemOpBaseIdx = 2;  // wback, Vt, [Rn], Xm
+  let MemOpOffsetIdx = 3;
 }
 
 multiclass SIMDLdrAliases<string BaseName, string asm, string layout, string Count,
@@ -11364,6 +11463,9 @@ class SIMDLdStSingleB<bit L, bit R, bits<3> opcode, string asm,
   let Inst{20-16} = 0b00000;
   let Inst{12} = idx{2};
   let Inst{11-10} = idx{1-0};
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 2;  // Vt, idx, [Rn]
 }
 class SIMDLdStSingleBTied<bit L, bit R, bits<3> opcode, string asm,
                       dag oops, dag iops, list<dag> pattern>
@@ -11376,6 +11478,9 @@ class SIMDLdStSingleBTied<bit L, bit R, bits<3> opcode, string asm,
   let Inst{20-16} = 0b00000;
   let Inst{12} = idx{2};
   let Inst{11-10} = idx{1-0};
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 3;  // dst, Vt, idx, [Rn]
 }
 class SIMDLdStSingleBPost<bit L, bit R, bits<3> opcode, string asm,
                           dag oops, dag iops>
@@ -11389,6 +11494,10 @@ class SIMDLdStSingleBPost<bit L, bit R, bits<3> opcode, string asm,
   let Inst{20-16} = Xm;
   let Inst{12} = idx{2};
   let Inst{11-10} = idx{1-0};
+
+  let MemOpAddrMode = MemOpAddrModePostIdxReg;
+  let MemOpBaseIdx = 3;  // wback, Vt, idx, [Rn], Xm
+  let MemOpOffsetIdx = 4;
 }
 class SIMDLdStSingleBTiedPost<bit L, bit R, bits<3> opcode, string asm,
                           dag oops, dag iops>
@@ -11402,6 +11511,10 @@ class SIMDLdStSingleBTiedPost<bit L, bit R, bits<3> opcode, string asm,
   let Inst{20-16} = Xm;
   let Inst{12} = idx{2};
   let Inst{11-10} = idx{1-0};
+
+  let MemOpAddrMode = MemOpAddrModePostIdxReg;
+  let MemOpBaseIdx = 4;  // wback, dst, Vt, idx, [Rn], Xm
+  let MemOpOffsetIdx = 5;
 }
 
 class SIMDLdStSingleH<bit L, bit R, bits<3> opcode, bit size, string asm,
@@ -11416,6 +11529,9 @@ class SIMDLdStSingleH<bit L, bit R, bits<3> opcode, bit size, string asm,
   let Inst{12} = idx{1};
   let Inst{11} = idx{0};
   let Inst{10} = size;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 2;  // Vt, idx, [Rn]
 }
 class SIMDLdStSingleHTied<bit L, bit R, bits<3> opcode, bit size, string asm,
                       dag oops, dag iops, list<dag> pattern>
@@ -11429,6 +11545,9 @@ class SIMDLdStSingleHTied<bit L, bit R, bits<3> opcode, bit size, string asm,
   let Inst{12} = idx{1};
   let Inst{11} = idx{0};
   let Inst{10} = size;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 3;  // dst, Vt, idx, [Rn]
 }
 
 class SIMDLdStSingleHPost<bit L, bit R, bits<3> opcode, bit size, string asm,
@@ -11444,6 +11563,10 @@ class SIMDLdStSingleHPost<bit L, bit R, bits<3> opcode, bit size, string asm,
   let Inst{12} = idx{1};
   let Inst{11} = idx{0};
   let Inst{10} = size;
+
+  let MemOpAddrMode = MemOpAddrModePostIdxReg;
+  let MemOpBaseIdx = 3;  // wback, Vt, idx, [Rn], Xm
+  let MemOpOffsetIdx = 4;
 }
 class SIMDLdStSingleHTiedPost<bit L, bit R, bits<3> opcode, bit size, string asm,
                           dag oops, dag iops>
@@ -11458,6 +11581,10 @@ class SIMDLdStSingleHTiedPost<bit L, bit R, bits<3> opcode, bit size, string asm
   let Inst{12} = idx{1};
   let Inst{11} = idx{0};
   let Inst{10} = size;
+
+  let MemOpAddrMode = MemOpAddrModePostIdxReg;
+  let MemOpBaseIdx = 4;  // wback, dst, Vt, idx, [Rn], Xm
+  let MemOpOffsetIdx = 5;
 }
 class SIMDLdStSingleS<bit L, bit R, bits<3> opcode, bits<2> size, string asm,
                       dag oops, dag iops, list<dag> pattern>
@@ -11470,6 +11597,9 @@ class SIMDLdStSingleS<bit L, bit R, bits<3> opcode, bits<2> size, string asm,
   let Inst{20-16} = 0b00000;
   let Inst{12} = idx{0};
   let Inst{11-10} = size;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 2;  // Vt, idx, [Rn]
 }
 class SIMDLdStSingleSTied<bit L, bit R, bits<3> opcode, bits<2> size, string asm,
                       dag oops, dag iops, list<dag> pattern>
@@ -11482,6 +11612,9 @@ class SIMDLdStSingleSTied<bit L, bit R, bits<3> opcode, bits<2> size, string asm
   let Inst{20-16} = 0b00000;
   let Inst{12} = idx{0};
   let Inst{11-10} = size;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 3;  // dst, Vt, idx, [Rn]
 }
 class SIMDLdStSingleSPost<bit L, bit R, bits<3> opcode, bits<2> size,
                           string asm, dag oops, dag iops>
@@ -11495,6 +11628,10 @@ class SIMDLdStSingleSPost<bit L, bit R, bits<3> opcode, bits<2> size,
   let Inst{20-16} = Xm;
   let Inst{12} = idx{0};
   let Inst{11-10} = size;
+
+  let MemOpAddrMode = MemOpAddrModePostIdxReg;
+  let MemOpBaseIdx = 3;  // wback, Vt, idx, [Rn], Xm
+  let MemOpOffsetIdx = 4;
 }
 class SIMDLdStSingleSTiedPost<bit L, bit R, bits<3> opcode, bits<2> size,
                           string asm, dag oops, dag iops>
@@ -11508,6 +11645,10 @@ class SIMDLdStSingleSTiedPost<bit L, bit R, bits<3> opcode, bits<2> size,
   let Inst{20-16} = Xm;
   let Inst{12} = idx{0};
   let Inst{11-10} = size;
+
+  let MemOpAddrMode = MemOpAddrModePostIdxReg;
+  let MemOpBaseIdx = 4;  // wback, dst, Vt, idx, [Rn], Xm
+  let MemOpOffsetIdx = 5;
 }
 class SIMDLdStSingleD<bit L, bit R, bits<3> opcode, bits<2> size, string asm,
                       dag oops, dag iops, list<dag> pattern>
@@ -11520,6 +11661,9 @@ class SIMDLdStSingleD<bit L, bit R, bits<3> opcode, bits<2> size, string asm,
   let Inst{20-16} = 0b00000;
   let Inst{12} = 0;
   let Inst{11-10} = size;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 2;  // Vt, idx, [Rn]
 }
 class SIMDLdStSingleDTied<bit L, bit R, bits<3> opcode, bits<2> size, string asm,
                       dag oops, dag iops, list<dag> pattern>
@@ -11532,6 +11676,9 @@ class SIMDLdStSingleDTied<bit L, bit R, bits<3> opcode, bits<2> size, string asm
   let Inst{20-16} = 0b00000;
   let Inst{12} = 0;
   let Inst{11-10} = size;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 3;  // dst, Vt, idx, [Rn]
 }
 class SIMDLdStSingleDPost<bit L, bit R, bits<3> opcode, bits<2> size,
                           string asm, dag oops, dag iops>
@@ -11545,6 +11692,10 @@ class SIMDLdStSingleDPost<bit L, bit R, bits<3> opcode, bits<2> size,
   let Inst{20-16} = Xm;
   let Inst{12} = 0;
   let Inst{11-10} = size;
+
+  let MemOpAddrMode = MemOpAddrModePostIdxReg;
+  let MemOpBaseIdx = 3;  // wback, Vt, idx, [Rn], Xm
+  let MemOpOffsetIdx = 4;
 }
 class SIMDLdStSingleDTiedPost<bit L, bit R, bits<3> opcode, bits<2> size,
                           string asm, dag oops, dag iops>
@@ -11558,6 +11709,10 @@ class SIMDLdStSingleDTiedPost<bit L, bit R, bits<3> opcode, bits<2> size,
   let Inst{20-16} = Xm;
   let Inst{12} = 0;
   let Inst{11-10} = size;
+
+  let MemOpAddrMode = MemOpAddrModePostIdxReg;
+  let MemOpBaseIdx = 4;  // wback, dst, Vt, idx, [Rn], Xm
+  let MemOpOffsetIdx = 5;
 }
 
 let mayLoad = 1, mayStore = 0, hasSideEffects = 0 in
@@ -12333,6 +12488,9 @@ class BaseCASEncoding<dag oops, dag iops, string asm, string operands,
   let Inst{9-5} = Rn;
   let Inst{4-0} = Rt;
   let Predicates = [HasLSE];
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 3;  // out, Rs, Rt, [Rn]
 }
 
 class BaseCAS<string order, string size, RegisterClass RC>
@@ -12386,6 +12544,9 @@ class BaseCASTEncoding<dag oops, dag iops, string asm,
   let Unpredictable{14-10} = 0b11111;
   let Inst{9-5} = Rn;
   let Inst{4-0} = Rt;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 3;  // out, Rs, Rt, [Rn]
 }
 
 multiclass CompareAndSwapUnprivileged<bits<2> Sz, bit L, bit o0, string order> {
@@ -12431,6 +12592,9 @@ class BaseSWP<string order, string size, RegisterClass RC>
   let Inst{9-5} = Rn;
   let Inst{4-0} = Rt;
   let Predicates = [HasLSE];
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 2;  // Rt, Rs, [Rn]
 }
 
 multiclass Swap<bits<1> Acq, bits<1> Rel, string order> {
@@ -12462,6 +12626,9 @@ class BaseSWPLSUI<string order, RegisterClass RC>
    let Inst{11-10} = 0b01;
    let Inst{9-5} = Rn;
    let Inst{4-0} = Rt;
+
+   let MemOpAddrMode = MemOpAddrModeNoIdx;
+   let MemOpBaseIdx = 2;  // Rt, Rs, [Rn]
 }
 
 multiclass SwapLSUI<bits<1> Acq, bits<1> Rel, string order> {
@@ -12493,6 +12660,9 @@ class BaseLDOPregister<string op, string order, string size, RegisterClass RC>
   let Inst{9-5} = Rn;
   let Inst{4-0} = Rt;
   let Predicates = [HasLSE];
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 2;  // Rt, Rs, [Rn]
 }
 
 multiclass LDOPregister<bits<3> opc, string op, bits<1> Acq, bits<1> Rel,
@@ -12529,6 +12699,9 @@ class BaseLDOPregisterLSUI<string op, string order, RegisterClass RC>
   let Inst{11-10} = 0b01;
   let Inst{9-5} = Rn;
   let Inst{4-0} = Rt;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 2;  // Rt, Rs, [Rn]
 }
 
 
@@ -13457,6 +13630,9 @@ class BaseAtomicFPLoad<RegisterClass regtype, bits<2> sz, bits<2> AR,
   let Inst{11-10} = 0b00;
   let Inst{9-5}   = Rn;
   let Inst{4-0}   = Rt;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 2;  // Rt, Rs, [Rn]
 }
 
 multiclass AtomicFPLoad<bits<2> AR, bits<3> op0, string asm> {
@@ -13486,6 +13662,9 @@ class BaseAtomicFPStore<RegisterClass regtype, bits<2> sz, bit R,
   let Inst{11-10} = 0b00;
   let Inst{9-5}   = Rn;
   let Inst{4-0}   = 0b11111;
+
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 1;  // Rs, [Rn]
 }
 
 multiclass AtomicFPStore<bit R, bits<3> op0, string asm> {
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 6c4a778b10f3f..94faae893c220 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -148,8 +148,10 @@ unsigned AArch64InstrInfo::getInstSizeInBytes(const MachineInstr &MI) const {
   // Specific cases handle instructions of variable sizes
   switch (Desc.getOpcode()) {
   default:
-    if (Desc.getSize())
-      return Desc.getSize();
+    if (Desc.getSize()) {
+      NumBytes = Desc.getSize();
+      break;
+    }
 
     // Anything not explicitly designated otherwise (i.e. pseudo-instructions
     // with fixed constant size but not specified in .td file) is a normal
@@ -199,6 +201,14 @@ unsigned AArch64InstrInfo::getInstSizeInBytes(const MachineInstr &MI) const {
     break;
   }
 
+  if (Subtarget.getTargetTriple().isLFI()) {
+    // Loads and stores may be expanded to include an additional guard
+    // instruction, so we overestimate the size here to allow things like
+    // branch relaxation to be more accurate.
+    if (Desc.mayLoad() || Desc.mayStore())
+      NumBytes += 4;
+  }
+
   return NumBytes;
 }
 
diff --git a/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp b/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
index 652844e9dc591..2e2a6ac610889 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
@@ -366,6 +366,11 @@ AArch64TargetMachine::AArch64TargetMachine(const Target &T, const Triple &TT,
     this->Options.NoTrapAfterNoreturn = true;
   }
 
+  // Disable jump table compression for LFI since it may cause assembler errors
+  // after LFI instrumentation if branch distances were incorrectly estimated.
+  if (TT.isLFI())
+    EnableCompressJumpTables = false;
+
   if (getMCAsmInfo()->usesWindowsCFI()) {
     // Unwinding can get confused if the last instruction in an
     // exception-handling region (function, funclet, try block, etc.)
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.cpp b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.cpp
new file mode 100644
index 0000000000000..3cd080cf2412a
--- /dev/null
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.cpp
@@ -0,0 +1,2070 @@
+//===- AArch64MCLFIRewriter.cpp ---------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements the AArch64MCLFIRewriter class, the AArch64 specific
+// subclass of MCLFIRewriter.
+//
+//===----------------------------------------------------------------------===//
+
+#include "AArch64MCLFIRewriter.h"
+#include "AArch64AddressingModes.h"
+#include "MCTargetDesc/AArch64MCTargetDesc.h"
+#include "Utils/AArch64BaseInfo.h"
+
+#include "llvm/MC/MCInst.h"
+#include "llvm/MC/MCInstrDesc.h"
+#include "llvm/MC/MCInstrInfo.h"
+#include "llvm/MC/MCStreamer.h"
+#include "llvm/MC/MCSubtargetInfo.h"
+#include "llvm/Support/CommandLine.h"
+
+using namespace llvm;
+
+static cl::opt<bool>
+    NoLFIGuardElim("aarch64-lfi-no-guard-elim", cl::Hidden,
+                   cl::desc("Disable LFI guard elimination optimization"),
+                   cl::init(false));
+
+namespace {
+// LFI reserved registers.
+constexpr MCRegister LFIBaseReg = AArch64::X27;
+constexpr MCRegister LFIAddrReg = AArch64::X28;
+constexpr MCRegister LFIScratchReg = AArch64::X26;
+constexpr MCRegister LFICtxReg = AArch64::X25;
+
+// Offset into the context register block (pointed to by LFICtxReg) where the
+// thread pointer is stored. This is a scaled offset (multiplied by 8 for
+// 64-bit loads), so a value of 4 means an actual byte offset of 32.
+constexpr unsigned LFITPOffset = 4;
+
+unsigned convertUiToRoW(unsigned Op);
+unsigned convertPreToRoW(unsigned Op);
+unsigned convertPostToRoW(unsigned Op);
+unsigned convertRoXToRoW(unsigned Op, unsigned &Shift);
+bool getRoWShift(unsigned Op, unsigned &Shift);
+unsigned getPrePostScale(unsigned Op);
+unsigned convertPrePostToBase(unsigned Op, bool &IsPre, bool &IsNoOffset);
+int getSIMDNaturalOffset(unsigned Op);
+
+bool isSyscall(const MCInst &Inst) { return Inst.getOpcode() == AArch64::SVC; }
+
+// Instructions that have mayLoad/mayStore set in TableGen but don't actually
+// perform memory accesses (barriers, hints, waits).
+bool isNotMemAccess(const MCInst &Inst) {
+  switch (Inst.getOpcode()) {
+  case AArch64::DMB:
+  case AArch64::DSB:
+  case AArch64::ISB:
+  case AArch64::HINT:
+    return true;
+  default:
+    return false;
+  }
+}
+
+bool isTLSRead(const MCInst &Inst) {
+  return Inst.getOpcode() == AArch64::MRS &&
+         Inst.getOperand(1).getImm() == AArch64SysReg::TPIDR_EL0;
+}
+
+bool isTLSWrite(const MCInst &Inst) {
+  return Inst.getOpcode() == AArch64::MSR &&
+         Inst.getOperand(0).getImm() == AArch64SysReg::TPIDR_EL0;
+}
+
+bool mayPrefetch(const MCInst &Inst) {
+  switch (Inst.getOpcode()) {
+  case AArch64::PRFMl:
+  case AArch64::PRFMroW:
+  case AArch64::PRFMroX:
+  case AArch64::PRFMui:
+  case AArch64::PRFUMi:
+    return true;
+  default:
+    return false;
+  }
+}
+
+bool isPACIASP(const MCInst &Inst) {
+  return Inst.getOpcode() == AArch64::PACIASP ||
+         (Inst.getOpcode() == AArch64::HINT &&
+          Inst.getOperand(0).getImm() == 25);
+}
+
+bool isDCZVA(const MCInst &Inst) {
+  // DC ZVA is encoded as SYSxt with op1=3, Cn=7, Cm=4, op2=1
+  if (Inst.getOpcode() != AArch64::SYSxt)
+    return false;
+  return Inst.getOperand(0).getImm() == 3 && // op1
+         Inst.getOperand(1).getImm() == 7 && // Cn
+         Inst.getOperand(2).getImm() == 4 && // Cm
+         Inst.getOperand(3).getImm() == 1;   // op2
+}
+
+bool isAuthenticatedBranch(unsigned Opcode) {
+  switch (Opcode) {
+  case AArch64::BRAA:
+  case AArch64::BRAAZ:
+  case AArch64::BRAB:
+  case AArch64::BRABZ:
+    return true;
+  default:
+    return false;
+  }
+}
+
+bool isAuthenticatedCall(unsigned Opcode) {
+  switch (Opcode) {
+  case AArch64::BLRAA:
+  case AArch64::BLRAAZ:
+  case AArch64::BLRAB:
+  case AArch64::BLRABZ:
+    return true;
+  default:
+    return false;
+  }
+}
+
+bool isAuthenticatedReturn(unsigned Opcode) {
+  switch (Opcode) {
+  case AArch64::RETAA:
+  case AArch64::RETAB:
+    return true;
+  default:
+    return false;
+  }
+}
+
+bool isExceptionReturn(unsigned Opcode) {
+  switch (Opcode) {
+  case AArch64::ERET:
+  case AArch64::ERETAA:
+  case AArch64::ERETAB:
+    return true;
+  default:
+    return false;
+  }
+}
+
+} // anonymous namespace
+
+bool AArch64MCLFIRewriter::mayModifyStack(const MCInst &Inst) const {
+  return mayModifyRegister(Inst, AArch64::SP);
+}
+
+bool AArch64MCLFIRewriter::mayModifyReserved(const MCInst &Inst) const {
+  return mayModifyRegister(Inst, LFIAddrReg) ||
+         mayModifyRegister(Inst, LFIBaseReg) ||
+         mayModifyRegister(Inst, LFICtxReg);
+}
+
+bool AArch64MCLFIRewriter::mayModifyLR(const MCInst &Inst) const {
+  // PACIASP signs LR but doesn't affect control flow safety.
+  if (isPACIASP(Inst))
+    return false;
+  return mayModifyRegister(Inst, AArch64::LR);
+}
+
+void AArch64MCLFIRewriter::onLabel(const MCSymbol *Symbol, MCStreamer &Out) {
+  // Flush deferred LR guard before a label, since labels are potential branch
+  // targets and the code after the label may use LR for control flow.
+  if (DeferredLRGuard && LastSTI) {
+    emitAddMask(AArch64::LR, AArch64::LR, Out, *LastSTI);
+    DeferredLRGuard = false;
+  }
+
+  // Reset the state for guard elimination.
+  ActiveGuard = false;
+}
+
+void AArch64MCLFIRewriter::finish(MCStreamer &Out) {
+  // Flush deferred LR guard at end of stream.
+  if (DeferredLRGuard && LastSTI) {
+    emitAddMask(AArch64::LR, AArch64::LR, Out, *LastSTI);
+    DeferredLRGuard = false;
+  }
+}
+
+void AArch64MCLFIRewriter::emitInst(const MCInst &Inst, MCStreamer &Out,
+                                    const MCSubtargetInfo &STI) {
+  // Guard elimination: invalidate the active guard if the instruction modifies
+  // the guarded register or x28 (which holds the guarded address), or affects
+  // control flow.
+  if (ActiveGuard) {
+    const MCInstrDesc &Desc = InstInfo->get(Inst.getOpcode());
+    if (Desc.mayAffectControlFlow(Inst, *RegInfo) ||
+        mayModifyRegister(Inst, ActiveGuardReg) ||
+        mayModifyRegister(Inst, getWRegFromXReg(ActiveGuardReg)) ||
+        mayModifyRegister(Inst, LFIAddrReg))
+      ActiveGuard = false;
+  }
+
+  Out.emitInstruction(Inst, STI);
+}
+
+void AArch64MCLFIRewriter::emitAddMask(MCRegister Dest, MCRegister Src,
+                                       MCStreamer &Out,
+                                       const MCSubtargetInfo &STI) {
+  // Guard elimination: skip if same guard already active.
+  if (!NoLFIGuardElim && Dest == LFIAddrReg && ActiveGuard &&
+      ActiveGuardReg == Src)
+    return;
+
+  // add Dest, LFIBaseReg, W(Src), uxtw
+  MCInst Inst;
+  Inst.setOpcode(AArch64::ADDXrx);
+  Inst.addOperand(MCOperand::createReg(Dest));
+  Inst.addOperand(MCOperand::createReg(LFIBaseReg));
+  Inst.addOperand(MCOperand::createReg(getWRegFromXReg(Src)));
+  Inst.addOperand(
+      MCOperand::createImm(AArch64_AM::getArithExtendImm(AArch64_AM::UXTW, 0)));
+  emitInst(Inst, Out, STI);
+
+  // Register Src as an actively guarded value.
+  if (Dest == LFIAddrReg) {
+    ActiveGuard = true;
+    ActiveGuardReg = Src;
+  }
+}
+
+void AArch64MCLFIRewriter::emitBranch(unsigned Opcode, MCRegister Target,
+                                      MCStreamer &Out,
+                                      const MCSubtargetInfo &STI) {
+  MCInst Branch;
+  Branch.setOpcode(Opcode);
+  Branch.addOperand(MCOperand::createReg(Target));
+  emitInst(Branch, Out, STI);
+}
+
+void AArch64MCLFIRewriter::emitMov(MCRegister Dest, MCRegister Src,
+                                   MCStreamer &Out,
+                                   const MCSubtargetInfo &STI) {
+  // orr Dest, xzr, Src
+  MCInst Inst;
+  Inst.setOpcode(AArch64::ORRXrs);
+  Inst.addOperand(MCOperand::createReg(Dest));
+  Inst.addOperand(MCOperand::createReg(AArch64::XZR));
+  Inst.addOperand(MCOperand::createReg(Src));
+  Inst.addOperand(MCOperand::createImm(0));
+  emitInst(Inst, Out, STI);
+}
+
+void AArch64MCLFIRewriter::emitAddImm(MCRegister Dest, MCRegister Src,
+                                      int64_t Imm, MCStreamer &Out,
+                                      const MCSubtargetInfo &STI) {
+  assert(std::abs(Imm) <= 4095);
+  MCInst Inst;
+  if (Imm >= 0) {
+    Inst.setOpcode(AArch64::ADDXri);
+    Inst.addOperand(MCOperand::createReg(Dest));
+    Inst.addOperand(MCOperand::createReg(Src));
+    Inst.addOperand(MCOperand::createImm(Imm));
+    Inst.addOperand(MCOperand::createImm(0)); // shift
+  } else {
+    Inst.setOpcode(AArch64::SUBXri);
+    Inst.addOperand(MCOperand::createReg(Dest));
+    Inst.addOperand(MCOperand::createReg(Src));
+    Inst.addOperand(MCOperand::createImm(-Imm));
+    Inst.addOperand(MCOperand::createImm(0)); // shift
+  }
+  emitInst(Inst, Out, STI);
+}
+
+void AArch64MCLFIRewriter::emitAddReg(MCRegister Dest, MCRegister Src1,
+                                      MCRegister Src2, unsigned Shift,
+                                      MCStreamer &Out,
+                                      const MCSubtargetInfo &STI) {
+  // add Dest, Src1, Src2, lsl #Shift
+  MCInst Inst;
+  Inst.setOpcode(AArch64::ADDXrs);
+  Inst.addOperand(MCOperand::createReg(Dest));
+  Inst.addOperand(MCOperand::createReg(Src1));
+  Inst.addOperand(MCOperand::createReg(Src2));
+  Inst.addOperand(
+      MCOperand::createImm(AArch64_AM::getShifterImm(AArch64_AM::LSL, Shift)));
+  emitInst(Inst, Out, STI);
+}
+
+void AArch64MCLFIRewriter::emitAddRegExtend(MCRegister Dest, MCRegister Src1,
+                                            MCRegister Src2,
+                                            AArch64_AM::ShiftExtendType ExtType,
+                                            unsigned Shift, MCStreamer &Out,
+                                            const MCSubtargetInfo &STI) {
+  // add Dest, Src1, Src2, ExtType #Shift
+  MCInst Inst;
+  if (ExtType == AArch64_AM::SXTX || ExtType == AArch64_AM::UXTX)
+    Inst.setOpcode(AArch64::ADDXrx64);
+  else
+    Inst.setOpcode(AArch64::ADDXrx);
+  Inst.addOperand(MCOperand::createReg(Dest));
+  Inst.addOperand(MCOperand::createReg(Src1));
+  Inst.addOperand(MCOperand::createReg(Src2));
+  Inst.addOperand(
+      MCOperand::createImm(AArch64_AM::getArithExtendImm(ExtType, Shift)));
+  emitInst(Inst, Out, STI);
+}
+
+void AArch64MCLFIRewriter::emitMemRoW(unsigned Opcode, const MCOperand &DataOp,
+                                      MCRegister BaseReg, MCStreamer &Out,
+                                      const MCSubtargetInfo &STI) {
+  // Emits: Op DataOp, [LFIBaseReg, W(BaseReg), uxtw].
+  MCInst Inst;
+  Inst.setOpcode(Opcode);
+  Inst.addOperand(DataOp);
+  Inst.addOperand(MCOperand::createReg(LFIBaseReg));
+  Inst.addOperand(MCOperand::createReg(getWRegFromXReg(BaseReg)));
+  Inst.addOperand(MCOperand::createImm(0)); // S bit = 0 (UXTW).
+  Inst.addOperand(MCOperand::createImm(0)); // Shift amount = 0 (unscaled).
+  emitInst(Inst, Out, STI);
+}
+
+void AArch64MCLFIRewriter::rewriteIndirectBranch(const MCInst &Inst,
+                                                 MCStreamer &Out,
+                                                 const MCSubtargetInfo &STI) {
+  assert(Inst.getOperand(0).isReg());
+  MCRegister BranchReg = Inst.getOperand(0).getReg();
+
+  // Guard the branch target through X28.
+  emitAddMask(LFIAddrReg, BranchReg, Out, STI);
+  emitBranch(Inst.getOpcode(), LFIAddrReg, Out, STI);
+}
+
+void AArch64MCLFIRewriter::rewriteCall(const MCInst &Inst, MCStreamer &Out,
+                                       const MCSubtargetInfo &STI) {
+  if (Inst.getOperand(0).isReg())
+    rewriteIndirectBranch(Inst, Out, STI);
+  else
+    emitInst(Inst, Out, STI);
+}
+
+void AArch64MCLFIRewriter::rewriteReturn(const MCInst &Inst, MCStreamer &Out,
+                                         const MCSubtargetInfo &STI) {
+  if (isExceptionReturn(Inst.getOpcode())) {
+    error(Inst, "exception returns (ERET/ERETAA/ERETAB) are not "
+                "supported by LFI");
+    return;
+  }
+
+  // Regular RET has an operand, handle it normally.
+  assert(Inst.getNumOperands() > 0 && Inst.getOperand(0).isReg());
+  // RET through LR is safe since LR is always within sandbox.
+  if (Inst.getOperand(0).getReg() != AArch64::LR)
+    rewriteIndirectBranch(Inst, Out, STI);
+  else
+    emitInst(Inst, Out, STI);
+}
+
+bool AArch64MCLFIRewriter::rewriteLoadStoreRoW(const MCInst &Inst,
+                                               MCStreamer &Out,
+                                               const MCSubtargetInfo &STI) {
+  unsigned Op = Inst.getOpcode();
+  unsigned MemOp;
+
+  // Case 1: Indexed load/store with immediate offset.
+  // ldr xN, [xM, #0] -> ldr xN, [x27, wM, uxtw]
+  // ldr xN, [xM, #imm] -> fall back to basic (non-zero offset)
+  if ((MemOp = convertUiToRoW(Op)) != AArch64::INSTRUCTION_LIST_END) {
+    MCRegister BaseReg = Inst.getOperand(1).getReg();
+    if (BaseReg == AArch64::SP)
+      return false;
+    const MCOperand &OffsetOp = Inst.getOperand(2);
+    if (OffsetOp.isImm() && OffsetOp.getImm() == 0) {
+      emitMemRoW(MemOp, Inst.getOperand(0), BaseReg, Out, STI);
+      return true;
+    }
+    return false;
+  }
+
+  // Case 2: Pre-index load/store.
+  // ldr xN, [xM, #imm]! -> add xM, xM, #imm; ldr xN, [x27, wM, uxtw]
+  // Pre-index: update base before the access.
+  if ((MemOp = convertPreToRoW(Op)) != AArch64::INSTRUCTION_LIST_END) {
+    MCRegister BaseReg = Inst.getOperand(2).getReg();
+    if (BaseReg == AArch64::SP)
+      return false;
+    int64_t Imm = Inst.getOperand(3).getImm();
+    emitAddImm(BaseReg, BaseReg, Imm, Out, STI);
+    emitMemRoW(MemOp, Inst.getOperand(1), BaseReg, Out, STI);
+    return true;
+  }
+
+  // Case 3: Post-index load/store.
+  // ldr xN, [xM], #imm -> ldr xN, [x27, wM, uxtw]; add xM, xM, #imm
+  // Post-index: update base after the access.
+  if ((MemOp = convertPostToRoW(Op)) != AArch64::INSTRUCTION_LIST_END) {
+    MCRegister BaseReg = Inst.getOperand(2).getReg();
+    if (BaseReg == AArch64::SP)
+      return false;
+    int64_t Imm = Inst.getOperand(3).getImm();
+    emitMemRoW(MemOp, Inst.getOperand(1), BaseReg, Out, STI);
+    emitAddImm(BaseReg, BaseReg, Imm, Out, STI);
+    return true;
+  }
+
+  // Case 4: Register-offset-X load/store.
+  // ldr xN, [xM1, xM2] -> add x26, xM1, xM2; ldr xN, [x27, w26, uxtw]
+  // ldr xN, [xM1, xM2, sxtx #shift] -> add x26, xM1, xM2, sxtx #shift;
+  //                                    ldr xN, [x27, w26, uxtw]
+  unsigned Shift;
+  if ((MemOp = convertRoXToRoW(Op, Shift)) != AArch64::INSTRUCTION_LIST_END) {
+    MCRegister Reg1 = Inst.getOperand(1).getReg();
+    MCRegister Reg2 = Inst.getOperand(2).getReg();
+    int64_t Extend = Inst.getOperand(3).getImm();
+    int64_t IsShift = Inst.getOperand(4).getImm();
+
+    if (!IsShift)
+      Shift = 0;
+
+    if (Extend) {
+      // Sign-extend: add Scratch, Reg1, Reg2, sxtx #Shift
+      emitAddRegExtend(LFIScratchReg, Reg1, Reg2, AArch64_AM::SXTX, Shift, Out,
+                       STI);
+    } else {
+      // No extend: add Scratch, Reg1, Reg2, lsl #Shift
+      emitAddReg(LFIScratchReg, Reg1, Reg2, Shift, Out, STI);
+    }
+    emitMemRoW(MemOp, Inst.getOperand(0), LFIScratchReg, Out, STI);
+    return true;
+  }
+
+  // Case 5: Register-offset-W load/store.
+  // ldr xN, [xM1, wM2, uxtw] -> add x26, xM1, wM2, uxtw;
+  //                             ldr xN, [x27, w26, uxtw]
+  // ldr xN, [xM1, wM2, sxtw #shift] -> add x26, xM1, wM2, sxtw #shift;
+  //                                    ldr xN, [x27, w26, uxtw]
+  if (getRoWShift(Op, Shift)) {
+    MemOp = Op;
+    MCRegister Reg1 = Inst.getOperand(1).getReg();
+    MCRegister Reg2 = Inst.getOperand(2).getReg();
+    int64_t S = Inst.getOperand(3).getImm();
+    int64_t IsShift = Inst.getOperand(4).getImm();
+
+    if (!IsShift)
+      Shift = 0;
+
+    if (S) {
+      // Sign-extend: add Scratch, Reg1, Reg2, sxtw #Shift
+      emitAddRegExtend(LFIScratchReg, Reg1, Reg2, AArch64_AM::SXTW, Shift, Out,
+                       STI);
+    } else {
+      // Unsigned extend: add Scratch, Reg1, Reg2, uxtw #Shift
+      emitAddRegExtend(LFIScratchReg, Reg1, Reg2, AArch64_AM::UXTW, Shift, Out,
+                       STI);
+    }
+    emitMemRoW(MemOp, Inst.getOperand(0), LFIScratchReg, Out, STI);
+    return true;
+  }
+
+  return false;
+}
+
+void AArch64MCLFIRewriter::rewriteLoadStoreBasic(const MCInst &Inst,
+                                                 MCStreamer &Out,
+                                                 const MCSubtargetInfo &STI) {
+  const MCInstrDesc &Desc = InstInfo->get(Inst.getOpcode());
+  uint64_t TSFlags = Desc.TSFlags;
+  unsigned Opcode = Inst.getOpcode();
+
+  uint64_t AddrMode = TSFlags & AArch64::MemOpAddrModeMask;
+  if (AddrMode == AArch64::MemOpAddrModeLiteral) {
+    error(Inst, "PC-relative literal loads are not supported in LFI");
+    return;
+  }
+
+  int BaseIdx = AArch64::getMemOpBaseRegIdx(TSFlags);
+  if (BaseIdx < 0) {
+    warning(Inst, "memory instruction not sandboxed: unknown addressing mode");
+    emitInst(Inst, Out, STI);
+    return;
+  }
+
+  MCRegister BaseReg = Inst.getOperand(BaseIdx).getReg();
+
+  // Stack accesses without register offset don't need rewriting.
+  if (BaseReg == AArch64::SP) {
+    int OffsetIdx = AArch64::getMemOpOffsetIdx(TSFlags);
+    if (OffsetIdx < 0 || !Inst.getOperand(OffsetIdx).isReg()) {
+      emitInst(Inst, Out, STI);
+      return;
+    }
+  }
+
+  // Guard the base register.
+  emitAddMask(LFIAddrReg, BaseReg, Out, STI);
+
+  // Check if this is a pre/post-index instruction that needs special handling.
+  bool IsPrePostIdx = AArch64::isMemOpPrePostIdx(TSFlags);
+  bool IsPre = false;
+  bool IsNoOffset = false;
+  unsigned BaseOpcode = convertPrePostToBase(Opcode, IsPre, IsNoOffset);
+
+  if (IsPrePostIdx && BaseOpcode != AArch64::INSTRUCTION_LIST_END) {
+    // This is a pair (LDP/STP) or SIMD structure instruction with
+    // pre/post-index. We need to demote it to the base indexed form.
+    //
+    // For pre-index:  ldp x0, x1, [x2, #16]! -> ldp x0, x1, [x28, #16];
+    //                                           add x2, x2, #16
+    // For post-index: ldp x0, x1, [x2], #16  -> ldp x0, x1, [x28];
+    //                                           add x2, x2, #16
+    MCInst NewInst;
+    NewInst.setOpcode(BaseOpcode);
+    NewInst.setLoc(Inst.getLoc());
+
+    // Copy operands up to (but not including) the base register.
+    // For LDPXpre: operands are [wback, Rt, Rt2, Rn, #imm]
+    // We skip wback (operand 0) and copy Rt, Rt2, then add LFIAddrReg.
+    for (int I = 1; I < BaseIdx; ++I)
+      NewInst.addOperand(Inst.getOperand(I));
+
+    // Add the guarded base register.
+    NewInst.addOperand(MCOperand::createReg(LFIAddrReg));
+
+    // For pre-index, include the offset; for post-index, use zero.
+    int OffsetIdx = AArch64::getMemOpOffsetIdx(TSFlags);
+    if (IsPre && OffsetIdx >= 0) {
+      NewInst.addOperand(Inst.getOperand(OffsetIdx));
+    } else if (!IsNoOffset) {
+      NewInst.addOperand(MCOperand::createImm(0));
+    }
+    emitInst(NewInst, Out, STI);
+
+    // Update the base register with the scaled offset.
+    if (OffsetIdx >= 0) {
+      const MCOperand &OffsetOp = Inst.getOperand(OffsetIdx);
+      if (OffsetOp.isImm()) {
+        int64_t Scale = getPrePostScale(Opcode);
+        int64_t Offset = OffsetOp.getImm() * Scale;
+        emitAddImm(BaseReg, BaseReg, Offset, Out, STI);
+      } else if (OffsetOp.isReg()) {
+        // SIMD post-index uses a register offset (XZR for natural offset).
+        MCRegister OffReg = OffsetOp.getReg();
+        if (OffReg == AArch64::XZR) {
+          int NaturalOffset = getSIMDNaturalOffset(Opcode);
+          if (NaturalOffset > 0) {
+            emitAddImm(BaseReg, BaseReg, NaturalOffset, Out, STI);
+          }
+        } else if (OffReg != AArch64::WZR) {
+          // Regular register offset.
+          emitAddReg(BaseReg, BaseReg, OffReg, 0, Out, STI);
+        }
+      }
+    }
+  } else if (IsPrePostIdx) {
+    // All scalar pre/post-index instructions are handled by
+    // rewriteLoadStoreRoW, and all pair/SIMD pre/post-index instructions are
+    // handled above. This path should not be reachable.
+    error(Inst, "unhandled pre/post-index instruction without uxtw form in LFI "
+                "rewriter");
+  } else {
+    // Non-pre/post instruction: just replace the base register.
+    MCInst NewInst;
+    NewInst.setOpcode(Opcode);
+    NewInst.setLoc(Inst.getLoc());
+    for (unsigned I = 0; I < Inst.getNumOperands(); ++I) {
+      if ((int)I == BaseIdx) {
+        NewInst.addOperand(MCOperand::createReg(LFIAddrReg));
+      } else {
+        NewInst.addOperand(Inst.getOperand(I));
+      }
+    }
+    emitInst(NewInst, Out, STI);
+  }
+}
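+
+// Sketch of the fallback guard+access rewrite performed above, assuming the
+// default register roles (x27 sandbox base, x28 guarded address register):
+//
+//   ldr x0, [x1, #16]  ->  add x28, x27, w1, uxtw  // guard base into x28
+//                          ldr x0, [x28, #16]      // access via guarded reg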
+
+void AArch64MCLFIRewriter::rewriteLoadStore(const MCInst &Inst, MCStreamer &Out,
+                                            const MCSubtargetInfo &STI) {
+  bool IsStore = mayStore(Inst);
+  bool IsLoad = mayLoad(Inst) || mayPrefetch(Inst);
+
+  // Check if this memory access needs sandboxing based on LFI mode.
+  // - Default: sandbox both loads and stores
+  // - +no-lfi-loads: stores-only mode, skip loads
+  // - +no-lfi-loads,+no-lfi-stores: jumps-only mode, skip all memory accesses
+  bool SkipLoads = STI.hasFeature(AArch64::FeatureNoLFILoads);
+  bool SkipStores = STI.hasFeature(AArch64::FeatureNoLFIStores);
+
+  if ((!IsLoad || SkipLoads) && (!IsStore || SkipStores)) {
+    emitInst(Inst, Out, STI);
+    return;
+  }
+
+  // Try RoW optimization first, then fall back to basic rewriting.
+  if (rewriteLoadStoreRoW(Inst, Out, STI))
+    return;
+
+  rewriteLoadStoreBasic(Inst, Out, STI);
+}
+
+void AArch64MCLFIRewriter::rewriteStackModification(
+    const MCInst &Inst, MCStreamer &Out, const MCSubtargetInfo &STI) {
+  // If this is a load/store that also modifies SP (a push/pop pattern), the
+  // SP-based access itself needs no rewriting; only a write to LR requires
+  // further handling.
+  if (mayLoad(Inst) || mayStore(Inst)) {
+    if (mayModifyLR(Inst))
+      return rewriteLRModification(Inst, Out, STI);
+    emitInst(Inst, Out, STI);
+    return;
+  }
+
+  // In jumps-only mode (+no-lfi-loads,+no-lfi-stores), no stack sandboxing is
+  // needed.
+  bool SkipLoads = STI.hasFeature(AArch64::FeatureNoLFILoads);
+  bool SkipStores = STI.hasFeature(AArch64::FeatureNoLFIStores);
+  if (SkipLoads && SkipStores) {
+    emitInst(Inst, Out, STI);
+    return;
+  }
+
+  // Redirect SP modification to scratch, then sandbox.
+  MCInst ModInst;
+  ModInst.setOpcode(Inst.getOpcode());
+  ModInst.setLoc(Inst.getLoc());
+
+  assert(Inst.getOperand(0).isReg() &&
+         Inst.getOperand(0).getReg() == AArch64::SP);
+
+  ModInst.addOperand(MCOperand::createReg(LFIScratchReg));
+  for (unsigned I = 1, E = Inst.getNumOperands(); I != E; ++I)
+    ModInst.addOperand(Inst.getOperand(I));
+
+  emitInst(ModInst, Out, STI);
+  emitAddMask(AArch64::SP, LFIScratchReg, Out, STI);
+}
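+
+// Sketch of the SP redirection performed above, assuming the default register
+// roles (x26 scratch, x27 sandbox base):
+//
+//   add sp, sp, #16  ->  add x26, sp, #16        // modify into scratch
+//                        add sp, x27, w26, uxtw  // sandbox back into SP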
+
+void AArch64MCLFIRewriter::rewriteLRModification(const MCInst &Inst,
+                                                 MCStreamer &Out,
+                                                 const MCSubtargetInfo &STI) {
+  // Emit the instruction with memory sandboxing if needed.
+  if (mayLoad(Inst) || mayStore(Inst))
+    rewriteLoadStore(Inst, Out, STI);
+  else
+    emitInst(Inst, Out, STI);
+
+  // Defer the LR guard until the next control flow instruction.
+  //
+  // This preserves PAC compatibility by letting the authentication
+  // instruction run before the mask (which destroys the PAC bits).
+  DeferredLRGuard = true;
+}
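+
+// Illustrative deferred-guard sequence (assuming x27 is the sandbox base):
+//
+//   ldr x30, [sp], #16   // LR modified; guard deferred
+//   autiasp              // authentication sees the unmasked LR
+//   ret                  // "add x30, x27, w30, uxtw" is emitted just before
+//                        // this control flow instruction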
+
+void AArch64MCLFIRewriter::rewriteAuthenticatedReturn(
+    const MCInst &Inst, MCStreamer &Out, const MCSubtargetInfo &STI) {
+  // Expand RETAA/RETAB to: AUTIASP/AUTIBSP, guard LR, RET
+  unsigned Opcode = Inst.getOpcode();
+
+  // Emit the appropriate AUTxSP instruction.
+  MCInst Auth;
+  if (Opcode == AArch64::RETAA)
+    Auth.setOpcode(AArch64::AUTIASP);
+  else
+    Auth.setOpcode(AArch64::AUTIBSP);
+  emitInst(Auth, Out, STI);
+
+  // Guard LR and emit RET.
+  emitAddMask(AArch64::LR, AArch64::LR, Out, STI);
+
+  MCInst Ret;
+  Ret.setOpcode(AArch64::RET);
+  Ret.addOperand(MCOperand::createReg(AArch64::LR));
+  emitInst(Ret, Out, STI);
+}
+
+void AArch64MCLFIRewriter::rewriteAuthenticatedBranchOrCall(
+    const MCInst &Inst, unsigned BranchOpcode, MCStreamer &Out,
+    const MCSubtargetInfo &STI) {
+  unsigned Opcode = Inst.getOpcode();
+  MCRegister TargetReg = Inst.getOperand(0).getReg();
+
+  MCInst Auth;
+  switch (Opcode) {
+  case AArch64::BRAA:
+  case AArch64::BLRAA:
+    Auth.setOpcode(AArch64::AUTIA);
+    Auth.addOperand(MCOperand::createReg(TargetReg)); // dst
+    Auth.addOperand(MCOperand::createReg(TargetReg)); // src (tied)
+    Auth.addOperand(Inst.getOperand(1));              // modifier
+    break;
+  case AArch64::BRAAZ:
+  case AArch64::BLRAAZ:
+    Auth.setOpcode(AArch64::AUTIZA);
+    Auth.addOperand(MCOperand::createReg(TargetReg)); // dst
+    Auth.addOperand(MCOperand::createReg(TargetReg)); // src (tied)
+    break;
+  case AArch64::BRAB:
+  case AArch64::BLRAB:
+    Auth.setOpcode(AArch64::AUTIB);
+    Auth.addOperand(MCOperand::createReg(TargetReg)); // dst
+    Auth.addOperand(MCOperand::createReg(TargetReg)); // src (tied)
+    Auth.addOperand(Inst.getOperand(1));              // modifier
+    break;
+  case AArch64::BRABZ:
+  case AArch64::BLRABZ:
+    Auth.setOpcode(AArch64::AUTIZB);
+    Auth.addOperand(MCOperand::createReg(TargetReg)); // dst
+    Auth.addOperand(MCOperand::createReg(TargetReg)); // src (tied)
+    break;
+  default:
+    llvm_unreachable("unexpected authenticated branch/call opcode");
+  }
+  emitInst(Auth, Out, STI);
+
+  // Guard the target and branch/call.
+  emitAddMask(LFIAddrReg, TargetReg, Out, STI);
+  emitBranch(BranchOpcode, LFIAddrReg, Out, STI);
+}
+
+void AArch64MCLFIRewriter::emitSyscall(MCStreamer &Out,
+                                       const MCSubtargetInfo &STI) {
+  // Save LR to scratch.
+  emitMov(LFIScratchReg, AArch64::LR, Out, STI);
+
+  // Load the syscall handler address from a fixed negative offset below the
+  // sandbox base.
+  MCInst Load;
+  Load.setOpcode(AArch64::LDURXi);
+  Load.addOperand(MCOperand::createReg(AArch64::LR));
+  Load.addOperand(MCOperand::createReg(LFIBaseReg));
+  Load.addOperand(MCOperand::createImm(-8));
+  emitInst(Load, Out, STI);
+
+  // Call the runtime.
+  emitBranch(AArch64::BLR, AArch64::LR, Out, STI);
+
+  // Restore LR with guard.
+  emitAddMask(AArch64::LR, LFIScratchReg, Out, STI);
+}
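+
+// Sketch of the sequence emitted above, assuming the default register roles
+// (x26 scratch, x27 sandbox base with the handler slot at offset -8):
+//
+//   svc #0  ->  mov x26, x30             // save LR to scratch
+//               ldur x30, [x27, #-8]     // load runtime handler address
+//               blr x30                  // call the runtime
+//               add x30, x27, w26, uxtw  // restore LR with guard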
+
+void AArch64MCLFIRewriter::rewriteSyscall(const MCInst &, MCStreamer &Out,
+                                          const MCSubtargetInfo &STI) {
+  emitSyscall(Out, STI);
+}
+
+void AArch64MCLFIRewriter::rewriteTLSRead(const MCInst &Inst, MCStreamer &Out,
+                                          const MCSubtargetInfo &STI) {
+  // mrs xN, tpidr_el0 -> ldr xN, [x25, #TP]
+  MCRegister DestReg = Inst.getOperand(0).getReg();
+
+  MCInst Load;
+  Load.setOpcode(AArch64::LDRXui);
+  Load.addOperand(MCOperand::createReg(DestReg));
+  Load.addOperand(MCOperand::createReg(LFICtxReg));
+  Load.addOperand(MCOperand::createImm(LFITPOffset));
+  emitInst(Load, Out, STI);
+}
+
+void AArch64MCLFIRewriter::rewriteTLSWrite(const MCInst &Inst, MCStreamer &Out,
+                                           const MCSubtargetInfo &STI) {
+  // msr tpidr_el0, xN -> str xN, [x25, #TP]
+  MCRegister SrcReg = Inst.getOperand(1).getReg();
+
+  MCInst Store;
+  Store.setOpcode(AArch64::STRXui);
+  Store.addOperand(MCOperand::createReg(SrcReg));
+  Store.addOperand(MCOperand::createReg(LFICtxReg));
+  Store.addOperand(MCOperand::createImm(LFITPOffset));
+  emitInst(Store, Out, STI);
+}
+
+void AArch64MCLFIRewriter::rewriteDCZVA(const MCInst &Inst, MCStreamer &Out,
+                                        const MCSubtargetInfo &STI) {
+  // dc zva, xN -> add x28, x27, wN, uxtw; dc zva, x28
+  MCRegister AddrReg = Inst.getOperand(4).getReg();
+
+  emitAddMask(LFIAddrReg, AddrReg, Out, STI);
+
+  MCInst NewInst;
+  NewInst.setOpcode(AArch64::SYSxt);
+  NewInst.addOperand(Inst.getOperand(0)); // op1
+  NewInst.addOperand(Inst.getOperand(1)); // Cn
+  NewInst.addOperand(Inst.getOperand(2)); // Cm
+  NewInst.addOperand(Inst.getOperand(3)); // op2
+  NewInst.addOperand(MCOperand::createReg(LFIAddrReg));
+  emitInst(NewInst, Out, STI);
+}
+
+void AArch64MCLFIRewriter::doRewriteInst(const MCInst &Inst, MCStreamer &Out,
+                                         const MCSubtargetInfo &STI) {
+  // Reserved register modification is an error.
+  if (mayModifyReserved(Inst)) {
+    error(Inst, "illegal modification of reserved LFI register");
+    return;
+  }
+
+  // System instructions.
+  if (isSyscall(Inst))
+    return rewriteSyscall(Inst, Out, STI);
+
+  if (isTLSRead(Inst))
+    return rewriteTLSRead(Inst, Out, STI);
+
+  if (isTLSWrite(Inst))
+    return rewriteTLSWrite(Inst, Out, STI);
+
+  if (isDCZVA(Inst))
+    return rewriteDCZVA(Inst, Out, STI);
+
+  // Authenticated PAC instructions are expanded to their component operations.
+  if (isAuthenticatedReturn(Inst.getOpcode()))
+    return rewriteAuthenticatedReturn(Inst, Out, STI);
+
+  if (isAuthenticatedBranch(Inst.getOpcode()))
+    return rewriteAuthenticatedBranchOrCall(Inst, AArch64::BR, Out, STI);
+
+  if (isAuthenticatedCall(Inst.getOpcode()))
+    return rewriteAuthenticatedBranchOrCall(Inst, AArch64::BLR, Out, STI);
+
+  // Emit deferred LR guard before control flow instructions.
+  if (DeferredLRGuard) {
+    if (isReturn(Inst) || isIndirectBranch(Inst) || isCall(Inst) ||
+        isBranch(Inst)) {
+      emitAddMask(AArch64::LR, AArch64::LR, Out, STI);
+      DeferredLRGuard = false;
+    }
+  }
+
+  // Control flow.
+  if (isReturn(Inst))
+    return rewriteReturn(Inst, Out, STI);
+
+  if (isIndirectBranch(Inst))
+    return rewriteIndirectBranch(Inst, Out, STI);
+
+  if (isCall(Inst))
+    return rewriteCall(Inst, Out, STI);
+
+  if (isBranch(Inst))
+    return emitInst(Inst, Out, STI);
+
+  // Register modifications that require sandboxing.
+  if (mayModifyStack(Inst))
+    return rewriteStackModification(Inst, Out, STI);
+
+  if (mayModifyLR(Inst))
+    return rewriteLRModification(Inst, Out, STI);
+
+  if (!isNotMemAccess(Inst) &&
+      (mayLoad(Inst) || mayStore(Inst) || mayPrefetch(Inst)))
+    return rewriteLoadStore(Inst, Out, STI);
+
+  emitInst(Inst, Out, STI);
+}
+
+bool AArch64MCLFIRewriter::rewriteInst(const MCInst &Inst, MCStreamer &Out,
+                                       const MCSubtargetInfo &STI) {
+  if (!Enabled || Guard)
+    return false;
+  Guard = true;
+  LastSTI = &STI;
+
+  doRewriteInst(Inst, Out, STI);
+
+  Guard = false;
+  return true;
+}
+
+namespace {
+
+// RoW (Register-offset-W) Opcode Conversion Tables
+//
+// These tables convert various load/store addressing modes to the
+// register-offset-W form ([X27, Wn, uxtw]) which provides sandboxing in a
+// single instruction by zero-extending the 32-bit offset register.
+
+// Convert indexed (ui) load/store to RoW form.
+// Example: LDRXui -> LDRXroW
+unsigned convertUiToRoW(unsigned Op) {
+  switch (Op) {
+  case AArch64::LDRBBui:
+    return AArch64::LDRBBroW;
+  case AArch64::LDRBui:
+    return AArch64::LDRBroW;
+  case AArch64::LDRDui:
+    return AArch64::LDRDroW;
+  case AArch64::LDRHHui:
+    return AArch64::LDRHHroW;
+  case AArch64::LDRHui:
+    return AArch64::LDRHroW;
+  case AArch64::LDRQui:
+    return AArch64::LDRQroW;
+  case AArch64::LDRSBWui:
+    return AArch64::LDRSBWroW;
+  case AArch64::LDRSBXui:
+    return AArch64::LDRSBXroW;
+  case AArch64::LDRSHWui:
+    return AArch64::LDRSHWroW;
+  case AArch64::LDRSHXui:
+    return AArch64::LDRSHXroW;
+  case AArch64::LDRSWui:
+    return AArch64::LDRSWroW;
+  case AArch64::LDRSui:
+    return AArch64::LDRSroW;
+  case AArch64::LDRWui:
+    return AArch64::LDRWroW;
+  case AArch64::LDRXui:
+    return AArch64::LDRXroW;
+  case AArch64::PRFMui:
+    return AArch64::PRFMroW;
+  case AArch64::STRBBui:
+    return AArch64::STRBBroW;
+  case AArch64::STRBui:
+    return AArch64::STRBroW;
+  case AArch64::STRDui:
+    return AArch64::STRDroW;
+  case AArch64::STRHHui:
+    return AArch64::STRHHroW;
+  case AArch64::STRHui:
+    return AArch64::STRHroW;
+  case AArch64::STRQui:
+    return AArch64::STRQroW;
+  case AArch64::STRSui:
+    return AArch64::STRSroW;
+  case AArch64::STRWui:
+    return AArch64::STRWroW;
+  case AArch64::STRXui:
+    return AArch64::STRXroW;
+  default:
+    return AArch64::INSTRUCTION_LIST_END;
+  }
+}
+
+// Convert pre-index load/store to RoW form.
+unsigned convertPreToRoW(unsigned Op) {
+  switch (Op) {
+  case AArch64::LDRBBpre:
+    return AArch64::LDRBBroW;
+  case AArch64::LDRBpre:
+    return AArch64::LDRBroW;
+  case AArch64::LDRDpre:
+    return AArch64::LDRDroW;
+  case AArch64::LDRHHpre:
+    return AArch64::LDRHHroW;
+  case AArch64::LDRHpre:
+    return AArch64::LDRHroW;
+  case AArch64::LDRQpre:
+    return AArch64::LDRQroW;
+  case AArch64::LDRSBWpre:
+    return AArch64::LDRSBWroW;
+  case AArch64::LDRSBXpre:
+    return AArch64::LDRSBXroW;
+  case AArch64::LDRSHWpre:
+    return AArch64::LDRSHWroW;
+  case AArch64::LDRSHXpre:
+    return AArch64::LDRSHXroW;
+  case AArch64::LDRSWpre:
+    return AArch64::LDRSWroW;
+  case AArch64::LDRSpre:
+    return AArch64::LDRSroW;
+  case AArch64::LDRWpre:
+    return AArch64::LDRWroW;
+  case AArch64::LDRXpre:
+    return AArch64::LDRXroW;
+  case AArch64::STRBBpre:
+    return AArch64::STRBBroW;
+  case AArch64::STRBpre:
+    return AArch64::STRBroW;
+  case AArch64::STRDpre:
+    return AArch64::STRDroW;
+  case AArch64::STRHHpre:
+    return AArch64::STRHHroW;
+  case AArch64::STRHpre:
+    return AArch64::STRHroW;
+  case AArch64::STRQpre:
+    return AArch64::STRQroW;
+  case AArch64::STRSpre:
+    return AArch64::STRSroW;
+  case AArch64::STRWpre:
+    return AArch64::STRWroW;
+  case AArch64::STRXpre:
+    return AArch64::STRXroW;
+  default:
+    return AArch64::INSTRUCTION_LIST_END;
+  }
+}
+
+// Convert post-index load/store to RoW form.
+unsigned convertPostToRoW(unsigned Op) {
+  switch (Op) {
+  case AArch64::LDRBBpost:
+    return AArch64::LDRBBroW;
+  case AArch64::LDRBpost:
+    return AArch64::LDRBroW;
+  case AArch64::LDRDpost:
+    return AArch64::LDRDroW;
+  case AArch64::LDRHHpost:
+    return AArch64::LDRHHroW;
+  case AArch64::LDRHpost:
+    return AArch64::LDRHroW;
+  case AArch64::LDRQpost:
+    return AArch64::LDRQroW;
+  case AArch64::LDRSBWpost:
+    return AArch64::LDRSBWroW;
+  case AArch64::LDRSBXpost:
+    return AArch64::LDRSBXroW;
+  case AArch64::LDRSHWpost:
+    return AArch64::LDRSHWroW;
+  case AArch64::LDRSHXpost:
+    return AArch64::LDRSHXroW;
+  case AArch64::LDRSWpost:
+    return AArch64::LDRSWroW;
+  case AArch64::LDRSpost:
+    return AArch64::LDRSroW;
+  case AArch64::LDRWpost:
+    return AArch64::LDRWroW;
+  case AArch64::LDRXpost:
+    return AArch64::LDRXroW;
+  case AArch64::STRBBpost:
+    return AArch64::STRBBroW;
+  case AArch64::STRBpost:
+    return AArch64::STRBroW;
+  case AArch64::STRDpost:
+    return AArch64::STRDroW;
+  case AArch64::STRHHpost:
+    return AArch64::STRHHroW;
+  case AArch64::STRHpost:
+    return AArch64::STRHroW;
+  case AArch64::STRQpost:
+    return AArch64::STRQroW;
+  case AArch64::STRSpost:
+    return AArch64::STRSroW;
+  case AArch64::STRWpost:
+    return AArch64::STRWroW;
+  case AArch64::STRXpost:
+    return AArch64::STRXroW;
+  default:
+    return AArch64::INSTRUCTION_LIST_END;
+  }
+}
+
+// Convert a register-offset-X load/store to RoW form; also returns the shift
+// amount.
+unsigned convertRoXToRoW(unsigned Op, unsigned &Shift) {
+  Shift = 0;
+  switch (Op) {
+  case AArch64::LDRBBroX:
+    return AArch64::LDRBBroW;
+  case AArch64::LDRBroX:
+    return AArch64::LDRBroW;
+  case AArch64::LDRDroX:
+    Shift = 3;
+    return AArch64::LDRDroW;
+  case AArch64::LDRHHroX:
+    Shift = 1;
+    return AArch64::LDRHHroW;
+  case AArch64::LDRHroX:
+    Shift = 1;
+    return AArch64::LDRHroW;
+  case AArch64::LDRQroX:
+    Shift = 4;
+    return AArch64::LDRQroW;
+  case AArch64::LDRSBWroX:
+    return AArch64::LDRSBWroW;
+  case AArch64::LDRSBXroX:
+    return AArch64::LDRSBXroW;
+  case AArch64::LDRSHWroX:
+    Shift = 1;
+    return AArch64::LDRSHWroW;
+  case AArch64::LDRSHXroX:
+    Shift = 1;
+    return AArch64::LDRSHXroW;
+  case AArch64::LDRSWroX:
+    Shift = 2;
+    return AArch64::LDRSWroW;
+  case AArch64::LDRSroX:
+    Shift = 2;
+    return AArch64::LDRSroW;
+  case AArch64::LDRWroX:
+    Shift = 2;
+    return AArch64::LDRWroW;
+  case AArch64::LDRXroX:
+    Shift = 3;
+    return AArch64::LDRXroW;
+  case AArch64::PRFMroX:
+    Shift = 3;
+    return AArch64::PRFMroW;
+  case AArch64::STRBBroX:
+    return AArch64::STRBBroW;
+  case AArch64::STRBroX:
+    return AArch64::STRBroW;
+  case AArch64::STRDroX:
+    Shift = 3;
+    return AArch64::STRDroW;
+  case AArch64::STRHHroX:
+    Shift = 1;
+    return AArch64::STRHHroW;
+  case AArch64::STRHroX:
+    Shift = 1;
+    return AArch64::STRHroW;
+  case AArch64::STRQroX:
+    Shift = 4;
+    return AArch64::STRQroW;
+  case AArch64::STRSroX:
+    Shift = 2;
+    return AArch64::STRSroW;
+  case AArch64::STRWroX:
+    Shift = 2;
+    return AArch64::STRWroW;
+  case AArch64::STRXroX:
+    Shift = 3;
+    return AArch64::STRXroW;
+  default:
+    return AArch64::INSTRUCTION_LIST_END;
+  }
+}
+
+// Check if Op is a register-offset-W instruction and return its shift amount.
+// Returns true if recognized, false otherwise.
+bool getRoWShift(unsigned Op, unsigned &Shift) {
+  Shift = 0;
+  switch (Op) {
+  case AArch64::LDRBBroW:
+  case AArch64::LDRBroW:
+  case AArch64::LDRSBWroW:
+  case AArch64::LDRSBXroW:
+  case AArch64::STRBBroW:
+  case AArch64::STRBroW:
+    return true;
+  case AArch64::LDRHHroW:
+  case AArch64::LDRHroW:
+  case AArch64::LDRSHWroW:
+  case AArch64::LDRSHXroW:
+  case AArch64::STRHHroW:
+  case AArch64::STRHroW:
+    Shift = 1;
+    return true;
+  case AArch64::LDRSWroW:
+  case AArch64::LDRSroW:
+  case AArch64::LDRWroW:
+  case AArch64::STRSroW:
+  case AArch64::STRWroW:
+    Shift = 2;
+    return true;
+  case AArch64::LDRDroW:
+  case AArch64::LDRXroW:
+  case AArch64::PRFMroW:
+  case AArch64::STRDroW:
+  case AArch64::STRXroW:
+    Shift = 3;
+    return true;
+  case AArch64::LDRQroW:
+  case AArch64::STRQroW:
+    Shift = 4;
+    return true;
+  default:
+    return false;
+  }
+}
+
+// Pre/Post-Index Conversion Tables
+//
+// These functions convert pre/post-index instructions to their base indexed
+// form and provide the scaling factor for the immediate offset.
+
+// Get the scaling factor for pair instruction pre/post-index immediates.
+// LDP/STP encode scaled offsets, so we need to multiply by this factor.
+unsigned getPrePostScale(unsigned Op) {
+  switch (Op) {
+  case AArch64::LDPDpost:
+  case AArch64::LDPDpre:
+  case AArch64::STPDpost:
+  case AArch64::STPDpre:
+  case AArch64::LDPXpost:
+  case AArch64::LDPXpre:
+  case AArch64::STPXpost:
+  case AArch64::STPXpre:
+    return 8;
+  case AArch64::LDPQpost:
+  case AArch64::LDPQpre:
+  case AArch64::STPQpost:
+  case AArch64::STPQpre:
+    return 16;
+  case AArch64::LDPSWpost:
+  case AArch64::LDPSWpre:
+  case AArch64::LDPSpost:
+  case AArch64::LDPSpre:
+  case AArch64::STPSpost:
+  case AArch64::STPSpre:
+  case AArch64::LDPWpost:
+  case AArch64::LDPWpre:
+  case AArch64::STPWpost:
+  case AArch64::STPWpre:
+    return 4;
+  default:
+    return 1;
+  }
+}
+
+// Convert pre/post-index opcode to its base indexed form.
+// Also sets IsPre to true if it's a pre-index instruction.
+// Sets IsNoOffset to true if the base form has no offset operand.
+unsigned convertPrePostToBase(unsigned Op, bool &IsPre, bool &IsNoOffset) {
+  IsPre = false;
+  IsNoOffset = false;
+  switch (Op) {
+  // LDP/STP pairs.
+  case AArch64::LDPDpost:
+    return AArch64::LDPDi;
+  case AArch64::LDPDpre:
+    IsPre = true;
+    return AArch64::LDPDi;
+  case AArch64::LDPQpost:
+    return AArch64::LDPQi;
+  case AArch64::LDPQpre:
+    IsPre = true;
+    return AArch64::LDPQi;
+  case AArch64::LDPSWpost:
+    return AArch64::LDPSWi;
+  case AArch64::LDPSWpre:
+    IsPre = true;
+    return AArch64::LDPSWi;
+  case AArch64::LDPSpost:
+    return AArch64::LDPSi;
+  case AArch64::LDPSpre:
+    IsPre = true;
+    return AArch64::LDPSi;
+  case AArch64::LDPWpost:
+    return AArch64::LDPWi;
+  case AArch64::LDPWpre:
+    IsPre = true;
+    return AArch64::LDPWi;
+  case AArch64::LDPXpost:
+    return AArch64::LDPXi;
+  case AArch64::LDPXpre:
+    IsPre = true;
+    return AArch64::LDPXi;
+  case AArch64::STPDpost:
+    return AArch64::STPDi;
+  case AArch64::STPDpre:
+    IsPre = true;
+    return AArch64::STPDi;
+  case AArch64::STPQpost:
+    return AArch64::STPQi;
+  case AArch64::STPQpre:
+    IsPre = true;
+    return AArch64::STPQi;
+  case AArch64::STPSpost:
+    return AArch64::STPSi;
+  case AArch64::STPSpre:
+    IsPre = true;
+    return AArch64::STPSi;
+  case AArch64::STPWpost:
+    return AArch64::STPWi;
+  case AArch64::STPWpre:
+    IsPre = true;
+    return AArch64::STPWi;
+  case AArch64::STPXpost:
+    return AArch64::STPXi;
+  case AArch64::STPXpre:
+    IsPre = true;
+    return AArch64::STPXi;
+  // SIMD single structure post-index.
+  case AArch64::LD1i8_POST:
+    IsNoOffset = true;
+    return AArch64::LD1i8;
+  case AArch64::LD1i16_POST:
+    IsNoOffset = true;
+    return AArch64::LD1i16;
+  case AArch64::LD1i32_POST:
+    IsNoOffset = true;
+    return AArch64::LD1i32;
+  case AArch64::LD1i64_POST:
+    IsNoOffset = true;
+    return AArch64::LD1i64;
+  case AArch64::ST1i8_POST:
+    IsNoOffset = true;
+    return AArch64::ST1i8;
+  case AArch64::ST1i16_POST:
+    IsNoOffset = true;
+    return AArch64::ST1i16;
+  case AArch64::ST1i32_POST:
+    IsNoOffset = true;
+    return AArch64::ST1i32;
+  case AArch64::ST1i64_POST:
+    IsNoOffset = true;
+    return AArch64::ST1i64;
+  case AArch64::LD2i8_POST:
+    IsNoOffset = true;
+    return AArch64::LD2i8;
+  case AArch64::LD2i16_POST:
+    IsNoOffset = true;
+    return AArch64::LD2i16;
+  case AArch64::LD2i32_POST:
+    IsNoOffset = true;
+    return AArch64::LD2i32;
+  case AArch64::LD2i64_POST:
+    IsNoOffset = true;
+    return AArch64::LD2i64;
+  case AArch64::ST2i8_POST:
+    IsNoOffset = true;
+    return AArch64::ST2i8;
+  case AArch64::ST2i16_POST:
+    IsNoOffset = true;
+    return AArch64::ST2i16;
+  case AArch64::ST2i32_POST:
+    IsNoOffset = true;
+    return AArch64::ST2i32;
+  case AArch64::ST2i64_POST:
+    IsNoOffset = true;
+    return AArch64::ST2i64;
+  case AArch64::LD3i8_POST:
+    IsNoOffset = true;
+    return AArch64::LD3i8;
+  case AArch64::LD3i16_POST:
+    IsNoOffset = true;
+    return AArch64::LD3i16;
+  case AArch64::LD3i32_POST:
+    IsNoOffset = true;
+    return AArch64::LD3i32;
+  case AArch64::LD3i64_POST:
+    IsNoOffset = true;
+    return AArch64::LD3i64;
+  case AArch64::ST3i8_POST:
+    IsNoOffset = true;
+    return AArch64::ST3i8;
+  case AArch64::ST3i16_POST:
+    IsNoOffset = true;
+    return AArch64::ST3i16;
+  case AArch64::ST3i32_POST:
+    IsNoOffset = true;
+    return AArch64::ST3i32;
+  case AArch64::ST3i64_POST:
+    IsNoOffset = true;
+    return AArch64::ST3i64;
+  case AArch64::LD4i8_POST:
+    IsNoOffset = true;
+    return AArch64::LD4i8;
+  case AArch64::LD4i16_POST:
+    IsNoOffset = true;
+    return AArch64::LD4i16;
+  case AArch64::LD4i32_POST:
+    IsNoOffset = true;
+    return AArch64::LD4i32;
+  case AArch64::LD4i64_POST:
+    IsNoOffset = true;
+    return AArch64::LD4i64;
+  case AArch64::ST4i8_POST:
+    IsNoOffset = true;
+    return AArch64::ST4i8;
+  case AArch64::ST4i16_POST:
+    IsNoOffset = true;
+    return AArch64::ST4i16;
+  case AArch64::ST4i32_POST:
+    IsNoOffset = true;
+    return AArch64::ST4i32;
+  case AArch64::ST4i64_POST:
+    IsNoOffset = true;
+    return AArch64::ST4i64;
+  // SIMD replicate post-index.
+  case AArch64::LD1Rv8b_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Rv8b;
+  case AArch64::LD1Rv16b_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Rv16b;
+  case AArch64::LD1Rv4h_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Rv4h;
+  case AArch64::LD1Rv8h_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Rv8h;
+  case AArch64::LD1Rv2s_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Rv2s;
+  case AArch64::LD1Rv4s_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Rv4s;
+  case AArch64::LD1Rv1d_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Rv1d;
+  case AArch64::LD1Rv2d_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Rv2d;
+  case AArch64::LD2Rv8b_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Rv8b;
+  case AArch64::LD2Rv16b_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Rv16b;
+  case AArch64::LD2Rv4h_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Rv4h;
+  case AArch64::LD2Rv8h_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Rv8h;
+  case AArch64::LD2Rv2s_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Rv2s;
+  case AArch64::LD2Rv4s_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Rv4s;
+  case AArch64::LD2Rv1d_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Rv1d;
+  case AArch64::LD2Rv2d_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Rv2d;
+  case AArch64::LD3Rv8b_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Rv8b;
+  case AArch64::LD3Rv16b_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Rv16b;
+  case AArch64::LD3Rv4h_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Rv4h;
+  case AArch64::LD3Rv8h_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Rv8h;
+  case AArch64::LD3Rv2s_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Rv2s;
+  case AArch64::LD3Rv4s_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Rv4s;
+  case AArch64::LD3Rv1d_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Rv1d;
+  case AArch64::LD3Rv2d_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Rv2d;
+  case AArch64::LD4Rv8b_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Rv8b;
+  case AArch64::LD4Rv16b_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Rv16b;
+  case AArch64::LD4Rv4h_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Rv4h;
+  case AArch64::LD4Rv8h_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Rv8h;
+  case AArch64::LD4Rv2s_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Rv2s;
+  case AArch64::LD4Rv4s_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Rv4s;
+  case AArch64::LD4Rv1d_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Rv1d;
+  case AArch64::LD4Rv2d_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Rv2d;
+  // SIMD multiple structures post-index (One).
+  case AArch64::LD1Onev8b_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Onev8b;
+  case AArch64::LD1Onev16b_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Onev16b;
+  case AArch64::LD1Onev4h_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Onev4h;
+  case AArch64::LD1Onev8h_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Onev8h;
+  case AArch64::LD1Onev2s_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Onev2s;
+  case AArch64::LD1Onev4s_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Onev4s;
+  case AArch64::LD1Onev1d_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Onev1d;
+  case AArch64::LD1Onev2d_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Onev2d;
+  case AArch64::ST1Onev8b_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Onev8b;
+  case AArch64::ST1Onev16b_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Onev16b;
+  case AArch64::ST1Onev4h_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Onev4h;
+  case AArch64::ST1Onev8h_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Onev8h;
+  case AArch64::ST1Onev2s_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Onev2s;
+  case AArch64::ST1Onev4s_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Onev4s;
+  case AArch64::ST1Onev1d_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Onev1d;
+  case AArch64::ST1Onev2d_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Onev2d;
+  // SIMD multiple structures post-index (Two).
+  case AArch64::LD1Twov8b_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Twov8b;
+  case AArch64::LD1Twov16b_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Twov16b;
+  case AArch64::LD1Twov4h_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Twov4h;
+  case AArch64::LD1Twov8h_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Twov8h;
+  case AArch64::LD1Twov2s_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Twov2s;
+  case AArch64::LD1Twov4s_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Twov4s;
+  case AArch64::LD1Twov1d_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Twov1d;
+  case AArch64::LD1Twov2d_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Twov2d;
+  case AArch64::ST1Twov8b_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Twov8b;
+  case AArch64::ST1Twov16b_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Twov16b;
+  case AArch64::ST1Twov4h_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Twov4h;
+  case AArch64::ST1Twov8h_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Twov8h;
+  case AArch64::ST1Twov2s_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Twov2s;
+  case AArch64::ST1Twov4s_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Twov4s;
+  case AArch64::ST1Twov1d_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Twov1d;
+  case AArch64::ST1Twov2d_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Twov2d;
+  // SIMD multiple structures post-index (Three).
+  case AArch64::LD1Threev8b_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Threev8b;
+  case AArch64::LD1Threev16b_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Threev16b;
+  case AArch64::LD1Threev4h_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Threev4h;
+  case AArch64::LD1Threev8h_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Threev8h;
+  case AArch64::LD1Threev2s_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Threev2s;
+  case AArch64::LD1Threev4s_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Threev4s;
+  case AArch64::LD1Threev1d_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Threev1d;
+  case AArch64::LD1Threev2d_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Threev2d;
+  case AArch64::ST1Threev8b_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Threev8b;
+  case AArch64::ST1Threev16b_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Threev16b;
+  case AArch64::ST1Threev4h_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Threev4h;
+  case AArch64::ST1Threev8h_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Threev8h;
+  case AArch64::ST1Threev2s_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Threev2s;
+  case AArch64::ST1Threev4s_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Threev4s;
+  case AArch64::ST1Threev1d_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Threev1d;
+  case AArch64::ST1Threev2d_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Threev2d;
+  // SIMD multiple structures post-index (Four).
+  case AArch64::LD1Fourv8b_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Fourv8b;
+  case AArch64::LD1Fourv16b_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Fourv16b;
+  case AArch64::LD1Fourv4h_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Fourv4h;
+  case AArch64::LD1Fourv8h_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Fourv8h;
+  case AArch64::LD1Fourv2s_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Fourv2s;
+  case AArch64::LD1Fourv4s_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Fourv4s;
+  case AArch64::LD1Fourv1d_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Fourv1d;
+  case AArch64::LD1Fourv2d_POST:
+    IsNoOffset = true;
+    return AArch64::LD1Fourv2d;
+  case AArch64::ST1Fourv8b_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Fourv8b;
+  case AArch64::ST1Fourv16b_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Fourv16b;
+  case AArch64::ST1Fourv4h_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Fourv4h;
+  case AArch64::ST1Fourv8h_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Fourv8h;
+  case AArch64::ST1Fourv2s_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Fourv2s;
+  case AArch64::ST1Fourv4s_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Fourv4s;
+  case AArch64::ST1Fourv1d_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Fourv1d;
+  case AArch64::ST1Fourv2d_POST:
+    IsNoOffset = true;
+    return AArch64::ST1Fourv2d;
+  // LD2/ST2 multiple structures.
+  case AArch64::LD2Twov8b_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Twov8b;
+  case AArch64::LD2Twov16b_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Twov16b;
+  case AArch64::LD2Twov4h_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Twov4h;
+  case AArch64::LD2Twov8h_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Twov8h;
+  case AArch64::LD2Twov2s_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Twov2s;
+  case AArch64::LD2Twov4s_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Twov4s;
+  case AArch64::LD2Twov2d_POST:
+    IsNoOffset = true;
+    return AArch64::LD2Twov2d;
+  case AArch64::ST2Twov8b_POST:
+    IsNoOffset = true;
+    return AArch64::ST2Twov8b;
+  case AArch64::ST2Twov16b_POST:
+    IsNoOffset = true;
+    return AArch64::ST2Twov16b;
+  case AArch64::ST2Twov4h_POST:
+    IsNoOffset = true;
+    return AArch64::ST2Twov4h;
+  case AArch64::ST2Twov8h_POST:
+    IsNoOffset = true;
+    return AArch64::ST2Twov8h;
+  case AArch64::ST2Twov2s_POST:
+    IsNoOffset = true;
+    return AArch64::ST2Twov2s;
+  case AArch64::ST2Twov4s_POST:
+    IsNoOffset = true;
+    return AArch64::ST2Twov4s;
+  case AArch64::ST2Twov2d_POST:
+    IsNoOffset = true;
+    return AArch64::ST2Twov2d;
+  // LD3/ST3 multiple structures.
+  case AArch64::LD3Threev8b_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Threev8b;
+  case AArch64::LD3Threev16b_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Threev16b;
+  case AArch64::LD3Threev4h_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Threev4h;
+  case AArch64::LD3Threev8h_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Threev8h;
+  case AArch64::LD3Threev2s_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Threev2s;
+  case AArch64::LD3Threev4s_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Threev4s;
+  case AArch64::LD3Threev2d_POST:
+    IsNoOffset = true;
+    return AArch64::LD3Threev2d;
+  case AArch64::ST3Threev8b_POST:
+    IsNoOffset = true;
+    return AArch64::ST3Threev8b;
+  case AArch64::ST3Threev16b_POST:
+    IsNoOffset = true;
+    return AArch64::ST3Threev16b;
+  case AArch64::ST3Threev4h_POST:
+    IsNoOffset = true;
+    return AArch64::ST3Threev4h;
+  case AArch64::ST3Threev8h_POST:
+    IsNoOffset = true;
+    return AArch64::ST3Threev8h;
+  case AArch64::ST3Threev2s_POST:
+    IsNoOffset = true;
+    return AArch64::ST3Threev2s;
+  case AArch64::ST3Threev4s_POST:
+    IsNoOffset = true;
+    return AArch64::ST3Threev4s;
+  case AArch64::ST3Threev2d_POST:
+    IsNoOffset = true;
+    return AArch64::ST3Threev2d;
+  // LD4/ST4 multiple structures.
+  case AArch64::LD4Fourv8b_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Fourv8b;
+  case AArch64::LD4Fourv16b_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Fourv16b;
+  case AArch64::LD4Fourv4h_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Fourv4h;
+  case AArch64::LD4Fourv8h_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Fourv8h;
+  case AArch64::LD4Fourv2s_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Fourv2s;
+  case AArch64::LD4Fourv4s_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Fourv4s;
+  case AArch64::LD4Fourv2d_POST:
+    IsNoOffset = true;
+    return AArch64::LD4Fourv2d;
+  case AArch64::ST4Fourv8b_POST:
+    IsNoOffset = true;
+    return AArch64::ST4Fourv8b;
+  case AArch64::ST4Fourv16b_POST:
+    IsNoOffset = true;
+    return AArch64::ST4Fourv16b;
+  case AArch64::ST4Fourv4h_POST:
+    IsNoOffset = true;
+    return AArch64::ST4Fourv4h;
+  case AArch64::ST4Fourv8h_POST:
+    IsNoOffset = true;
+    return AArch64::ST4Fourv8h;
+  case AArch64::ST4Fourv2s_POST:
+    IsNoOffset = true;
+    return AArch64::ST4Fourv2s;
+  case AArch64::ST4Fourv4s_POST:
+    IsNoOffset = true;
+    return AArch64::ST4Fourv4s;
+  case AArch64::ST4Fourv2d_POST:
+    IsNoOffset = true;
+    return AArch64::ST4Fourv2d;
+  default:
+    return AArch64::INSTRUCTION_LIST_END;
+  }
+}
+
+// Get the natural offset for SIMD post-index instructions.
+// These instructions have XZR as the register operand when using the
+// natural (implicit) offset.
+int getSIMDNaturalOffset(unsigned Op) {
+  switch (Op) {
+  // LD1/ST1 single structure.
+  case AArch64::LD1i8_POST:
+  case AArch64::ST1i8_POST:
+    return 1;
+  case AArch64::LD1i16_POST:
+  case AArch64::ST1i16_POST:
+    return 2;
+  case AArch64::LD1i32_POST:
+  case AArch64::ST1i32_POST:
+    return 4;
+  case AArch64::LD1i64_POST:
+  case AArch64::ST1i64_POST:
+    return 8;
+  // LD2/ST2 single structure.
+  case AArch64::LD2i8_POST:
+  case AArch64::ST2i8_POST:
+    return 2;
+  case AArch64::LD2i16_POST:
+  case AArch64::ST2i16_POST:
+    return 4;
+  case AArch64::LD2i32_POST:
+  case AArch64::ST2i32_POST:
+    return 8;
+  case AArch64::LD2i64_POST:
+  case AArch64::ST2i64_POST:
+    return 16;
+  // LD3/ST3 single structure.
+  case AArch64::LD3i8_POST:
+  case AArch64::ST3i8_POST:
+    return 3;
+  case AArch64::LD3i16_POST:
+  case AArch64::ST3i16_POST:
+    return 6;
+  case AArch64::LD3i32_POST:
+  case AArch64::ST3i32_POST:
+    return 12;
+  case AArch64::LD3i64_POST:
+  case AArch64::ST3i64_POST:
+    return 24;
+  // LD4/ST4 single structure.
+  case AArch64::LD4i8_POST:
+  case AArch64::ST4i8_POST:
+    return 4;
+  case AArch64::LD4i16_POST:
+  case AArch64::ST4i16_POST:
+    return 8;
+  case AArch64::LD4i32_POST:
+  case AArch64::ST4i32_POST:
+    return 16;
+  case AArch64::LD4i64_POST:
+  case AArch64::ST4i64_POST:
+    return 32;
+  // LD1R.
+  case AArch64::LD1Rv8b_POST:
+  case AArch64::LD1Rv16b_POST:
+    return 1;
+  case AArch64::LD1Rv4h_POST:
+  case AArch64::LD1Rv8h_POST:
+    return 2;
+  case AArch64::LD1Rv2s_POST:
+  case AArch64::LD1Rv4s_POST:
+    return 4;
+  case AArch64::LD1Rv1d_POST:
+  case AArch64::LD1Rv2d_POST:
+    return 8;
+  // LD2R.
+  case AArch64::LD2Rv8b_POST:
+  case AArch64::LD2Rv16b_POST:
+    return 2;
+  case AArch64::LD2Rv4h_POST:
+  case AArch64::LD2Rv8h_POST:
+    return 4;
+  case AArch64::LD2Rv2s_POST:
+  case AArch64::LD2Rv4s_POST:
+    return 8;
+  case AArch64::LD2Rv1d_POST:
+  case AArch64::LD2Rv2d_POST:
+    return 16;
+  // LD3R.
+  case AArch64::LD3Rv8b_POST:
+  case AArch64::LD3Rv16b_POST:
+    return 3;
+  case AArch64::LD3Rv4h_POST:
+  case AArch64::LD3Rv8h_POST:
+    return 6;
+  case AArch64::LD3Rv2s_POST:
+  case AArch64::LD3Rv4s_POST:
+    return 12;
+  case AArch64::LD3Rv1d_POST:
+  case AArch64::LD3Rv2d_POST:
+    return 24;
+  // LD4R.
+  case AArch64::LD4Rv8b_POST:
+  case AArch64::LD4Rv16b_POST:
+    return 4;
+  case AArch64::LD4Rv4h_POST:
+  case AArch64::LD4Rv8h_POST:
+    return 8;
+  case AArch64::LD4Rv2s_POST:
+  case AArch64::LD4Rv4s_POST:
+    return 16;
+  case AArch64::LD4Rv1d_POST:
+  case AArch64::LD4Rv2d_POST:
+    return 32;
+  // LD1/ST1 multiple structures (8b).
+  case AArch64::LD1Onev8b_POST:
+  case AArch64::ST1Onev8b_POST:
+    return 8;
+  case AArch64::LD1Twov8b_POST:
+  case AArch64::ST1Twov8b_POST:
+    return 16;
+  case AArch64::LD1Threev8b_POST:
+  case AArch64::ST1Threev8b_POST:
+    return 24;
+  case AArch64::LD1Fourv8b_POST:
+  case AArch64::ST1Fourv8b_POST:
+    return 32;
+  // LD1/ST1 multiple structures (16b).
+  case AArch64::LD1Onev16b_POST:
+  case AArch64::ST1Onev16b_POST:
+    return 16;
+  case AArch64::LD1Twov16b_POST:
+  case AArch64::ST1Twov16b_POST:
+    return 32;
+  case AArch64::LD1Threev16b_POST:
+  case AArch64::ST1Threev16b_POST:
+    return 48;
+  case AArch64::LD1Fourv16b_POST:
+  case AArch64::ST1Fourv16b_POST:
+    return 64;
+  // LD1/ST1 multiple structures (4h).
+  case AArch64::LD1Onev4h_POST:
+  case AArch64::ST1Onev4h_POST:
+    return 8;
+  case AArch64::LD1Twov4h_POST:
+  case AArch64::ST1Twov4h_POST:
+    return 16;
+  case AArch64::LD1Threev4h_POST:
+  case AArch64::ST1Threev4h_POST:
+    return 24;
+  case AArch64::LD1Fourv4h_POST:
+  case AArch64::ST1Fourv4h_POST:
+    return 32;
+  // LD1/ST1 multiple structures (8h).
+  case AArch64::LD1Onev8h_POST:
+  case AArch64::ST1Onev8h_POST:
+    return 16;
+  case AArch64::LD1Twov8h_POST:
+  case AArch64::ST1Twov8h_POST:
+    return 32;
+  case AArch64::LD1Threev8h_POST:
+  case AArch64::ST1Threev8h_POST:
+    return 48;
+  case AArch64::LD1Fourv8h_POST:
+  case AArch64::ST1Fourv8h_POST:
+    return 64;
+  // LD1/ST1 multiple structures (2s).
+  case AArch64::LD1Onev2s_POST:
+  case AArch64::ST1Onev2s_POST:
+    return 8;
+  case AArch64::LD1Twov2s_POST:
+  case AArch64::ST1Twov2s_POST:
+    return 16;
+  case AArch64::LD1Threev2s_POST:
+  case AArch64::ST1Threev2s_POST:
+    return 24;
+  case AArch64::LD1Fourv2s_POST:
+  case AArch64::ST1Fourv2s_POST:
+    return 32;
+  // LD1/ST1 multiple structures (4s).
+  case AArch64::LD1Onev4s_POST:
+  case AArch64::ST1Onev4s_POST:
+    return 16;
+  case AArch64::LD1Twov4s_POST:
+  case AArch64::ST1Twov4s_POST:
+    return 32;
+  case AArch64::LD1Threev4s_POST:
+  case AArch64::ST1Threev4s_POST:
+    return 48;
+  case AArch64::LD1Fourv4s_POST:
+  case AArch64::ST1Fourv4s_POST:
+    return 64;
+  // LD1/ST1 multiple structures (1d).
+  case AArch64::LD1Onev1d_POST:
+  case AArch64::ST1Onev1d_POST:
+    return 8;
+  case AArch64::LD1Twov1d_POST:
+  case AArch64::ST1Twov1d_POST:
+    return 16;
+  case AArch64::LD1Threev1d_POST:
+  case AArch64::ST1Threev1d_POST:
+    return 24;
+  case AArch64::LD1Fourv1d_POST:
+  case AArch64::ST1Fourv1d_POST:
+    return 32;
+  // LD1/ST1 multiple structures (2d).
+  case AArch64::LD1Onev2d_POST:
+  case AArch64::ST1Onev2d_POST:
+    return 16;
+  case AArch64::LD1Twov2d_POST:
+  case AArch64::ST1Twov2d_POST:
+    return 32;
+  case AArch64::LD1Threev2d_POST:
+  case AArch64::ST1Threev2d_POST:
+    return 48;
+  case AArch64::LD1Fourv2d_POST:
+  case AArch64::ST1Fourv2d_POST:
+    return 64;
+  // LD2/ST2 multiple structures.
+  case AArch64::LD2Twov8b_POST:
+  case AArch64::ST2Twov8b_POST:
+    return 16;
+  case AArch64::LD2Twov16b_POST:
+  case AArch64::ST2Twov16b_POST:
+    return 32;
+  case AArch64::LD2Twov4h_POST:
+  case AArch64::ST2Twov4h_POST:
+    return 16;
+  case AArch64::LD2Twov8h_POST:
+  case AArch64::ST2Twov8h_POST:
+    return 32;
+  case AArch64::LD2Twov2s_POST:
+  case AArch64::ST2Twov2s_POST:
+    return 16;
+  case AArch64::LD2Twov4s_POST:
+  case AArch64::ST2Twov4s_POST:
+    return 32;
+  case AArch64::LD2Twov2d_POST:
+  case AArch64::ST2Twov2d_POST:
+    return 32;
+  // LD3/ST3 multiple structures.
+  case AArch64::LD3Threev8b_POST:
+  case AArch64::ST3Threev8b_POST:
+    return 24;
+  case AArch64::LD3Threev16b_POST:
+  case AArch64::ST3Threev16b_POST:
+    return 48;
+  case AArch64::LD3Threev4h_POST:
+  case AArch64::ST3Threev4h_POST:
+    return 24;
+  case AArch64::LD3Threev8h_POST:
+  case AArch64::ST3Threev8h_POST:
+    return 48;
+  case AArch64::LD3Threev2s_POST:
+  case AArch64::ST3Threev2s_POST:
+    return 24;
+  case AArch64::LD3Threev4s_POST:
+  case AArch64::ST3Threev4s_POST:
+    return 48;
+  case AArch64::LD3Threev2d_POST:
+  case AArch64::ST3Threev2d_POST:
+    return 48;
+  // LD4/ST4 multiple structures.
+  case AArch64::LD4Fourv8b_POST:
+  case AArch64::ST4Fourv8b_POST:
+    return 32;
+  case AArch64::LD4Fourv16b_POST:
+  case AArch64::ST4Fourv16b_POST:
+    return 64;
+  case AArch64::LD4Fourv4h_POST:
+  case AArch64::ST4Fourv4h_POST:
+    return 32;
+  case AArch64::LD4Fourv8h_POST:
+  case AArch64::ST4Fourv8h_POST:
+    return 64;
+  case AArch64::LD4Fourv2s_POST:
+  case AArch64::ST4Fourv2s_POST:
+    return 32;
+  case AArch64::LD4Fourv4s_POST:
+  case AArch64::ST4Fourv4s_POST:
+    return 64;
+  case AArch64::LD4Fourv2d_POST:
+  case AArch64::ST4Fourv2d_POST:
+    return 64;
+  default:
+    return -1;
+  }
+}
+
+} // anonymous namespace
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.h b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.h
new file mode 100644
index 0000000000000..1e3010643eafe
--- /dev/null
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.h
@@ -0,0 +1,144 @@
+//===- AArch64MCLFIRewriter.h -----------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file declares the AArch64MCLFIRewriter class, the AArch64 specific
+// subclass of MCLFIRewriter.
+//
+//===----------------------------------------------------------------------===//
+#ifndef LLVM_LIB_TARGET_AARCH64_MCTARGETDESC_AARCH64MCLFIREWRITER_H
+#define LLVM_LIB_TARGET_AARCH64_MCTARGETDESC_AARCH64MCLFIREWRITER_H
+
+#include "AArch64AddressingModes.h"
+#include "llvm/MC/MCInstrInfo.h"
+#include "llvm/MC/MCLFIRewriter.h"
+#include "llvm/MC/MCRegister.h"
+#include "llvm/MC/MCRegisterInfo.h"
+
+namespace llvm {
+class MCContext;
+class MCInst;
+class MCOperand;
+class MCStreamer;
+class MCSubtargetInfo;
+
+/// Rewrites AArch64 instructions for LFI sandboxing.
+///
+/// This class implements the LFI (Lightweight Fault Isolation) rewriting
+/// for AArch64 instructions. It transforms instructions to ensure memory
+/// accesses and control flow are confined within the sandbox region.
+///
+/// Reserved registers:
+/// - X27: Sandbox base address (always holds the sandbox base)
+/// - X28: Safe address register (always points within the sandbox)
+/// - X26: Scratch register for intermediate calculations
+/// - X25: Context register (points to thread-local runtime data)
+/// - SP:  Stack pointer (always within the sandbox)
+/// - X30: Link register (always within the sandbox)
+class AArch64MCLFIRewriter : public MCLFIRewriter {
+public:
+  AArch64MCLFIRewriter(MCContext &Ctx, std::unique_ptr<MCRegisterInfo> &&RI,
+                       std::unique_ptr<MCInstrInfo> &&II)
+      : MCLFIRewriter(Ctx, std::move(RI), std::move(II)) {}
+
+  bool rewriteInst(const MCInst &Inst, MCStreamer &Out,
+                   const MCSubtargetInfo &STI) override;
+
+  void onLabel(const MCSymbol *Symbol, MCStreamer &Out) override;
+  void finish(MCStreamer &Out) override;
+
+private:
+  /// Recursion guard to prevent infinite loops when emitting instructions.
+  bool Guard = false;
+
+  /// Deferred LR guard: emitted before the next control-flow instruction.
+  /// Deferring the guard keeps PAC-compatible code working, since x30 is
+  /// sandboxed only when it is actually needed for control flow.
+  bool DeferredLRGuard = false;
+
+  /// Last seen MCSubtargetInfo, used for deferred LR guard emission.
+  const MCSubtargetInfo *LastSTI = nullptr;
+
+  /// Guard elimination state: tracks the register x28 was last guarded with.
+  /// Reset when a label is emitted, a control-flow instruction is processed,
+  /// or the guarded register is modified.
+  bool ActiveGuard = false;
+  MCRegister ActiveGuardReg;
+
+  // Instruction classification.
+  bool mayModifyStack(const MCInst &Inst) const;
+  bool mayModifyReserved(const MCInst &Inst) const;
+  bool mayModifyLR(const MCInst &Inst) const;
+
+  // Instruction emission.
+  void emitInst(const MCInst &Inst, MCStreamer &Out,
+                const MCSubtargetInfo &STI);
+  void emitAddMask(MCRegister Dest, MCRegister Src, MCStreamer &Out,
+                   const MCSubtargetInfo &STI);
+  void emitBranch(unsigned Opcode, MCRegister Target, MCStreamer &Out,
+                  const MCSubtargetInfo &STI);
+  void emitMov(MCRegister Dest, MCRegister Src, MCStreamer &Out,
+               const MCSubtargetInfo &STI);
+  void emitAddImm(MCRegister Dest, MCRegister Src, int64_t Imm, MCStreamer &Out,
+                  const MCSubtargetInfo &STI);
+  void emitAddReg(MCRegister Dest, MCRegister Src1, MCRegister Src2,
+                  unsigned Shift, MCStreamer &Out, const MCSubtargetInfo &STI);
+  void emitAddRegExtend(MCRegister Dest, MCRegister Src1, MCRegister Src2,
+                        AArch64_AM::ShiftExtendType ExtType, unsigned Shift,
+                        MCStreamer &Out, const MCSubtargetInfo &STI);
+  void emitMemRoW(unsigned Opcode, const MCOperand &DataOp, MCRegister BaseReg,
+                  MCStreamer &Out, const MCSubtargetInfo &STI);
+
+  // Rewriting logic.
+  void doRewriteInst(const MCInst &Inst, MCStreamer &Out,
+                     const MCSubtargetInfo &STI);
+
+  // Control flow.
+  void rewriteIndirectBranch(const MCInst &Inst, MCStreamer &Out,
+                             const MCSubtargetInfo &STI);
+  void rewriteCall(const MCInst &Inst, MCStreamer &Out,
+                   const MCSubtargetInfo &STI);
+  void rewriteReturn(const MCInst &Inst, MCStreamer &Out,
+                     const MCSubtargetInfo &STI);
+
+  // Memory access.
+  void rewriteLoadStore(const MCInst &Inst, MCStreamer &Out,
+                        const MCSubtargetInfo &STI);
+  void rewriteLoadStoreBasic(const MCInst &Inst, MCStreamer &Out,
+                             const MCSubtargetInfo &STI);
+  bool rewriteLoadStoreRoW(const MCInst &Inst, MCStreamer &Out,
+                           const MCSubtargetInfo &STI);
+
+  // Register modification.
+  void rewriteStackModification(const MCInst &Inst, MCStreamer &Out,
+                                const MCSubtargetInfo &STI);
+  void rewriteLRModification(const MCInst &Inst, MCStreamer &Out,
+                             const MCSubtargetInfo &STI);
+
+  // PAC (Pointer Authentication Code) instructions.
+  void rewriteAuthenticatedReturn(const MCInst &Inst, MCStreamer &Out,
+                                  const MCSubtargetInfo &STI);
+  void rewriteAuthenticatedBranchOrCall(const MCInst &Inst,
+                                        unsigned BranchOpcode, MCStreamer &Out,
+                                        const MCSubtargetInfo &STI);
+
+  // System instructions.
+  void rewriteSyscall(const MCInst &Inst, MCStreamer &Out,
+                      const MCSubtargetInfo &STI);
+  void rewriteTLSRead(const MCInst &Inst, MCStreamer &Out,
+                      const MCSubtargetInfo &STI);
+  void rewriteTLSWrite(const MCInst &Inst, MCStreamer &Out,
+                       const MCSubtargetInfo &STI);
+  void rewriteDCZVA(const MCInst &Inst, MCStreamer &Out,
+                    const MCSubtargetInfo &STI);
+
+  void emitSyscall(MCStreamer &Out, const MCSubtargetInfo &STI);
+};
+
+} // namespace llvm
+
+#endif // LLVM_LIB_TARGET_AARCH64_MCTARGETDESC_AARCH64MCLFIREWRITER_H
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp
index 5c8f57664a2cc..d681e75b314b7 100644
--- a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp
@@ -13,6 +13,7 @@
 #include "AArch64MCTargetDesc.h"
 #include "AArch64ELFStreamer.h"
 #include "AArch64MCAsmInfo.h"
+#include "AArch64MCLFIRewriter.h"
 #include "AArch64WinCOFFStreamer.h"
 #include "MCTargetDesc/AArch64AddressingModes.h"
 #include "MCTargetDesc/AArch64InstPrinter.h"
@@ -503,6 +504,17 @@ static MCInstrAnalysis *createAArch64InstrAnalysis(const MCInstrInfo *Info) {
   return new AArch64MCInstrAnalysis(Info);
 }
 
+static MCLFIRewriter *
+createAArch64MCLFIRewriter(MCStreamer &S,
+                           std::unique_ptr<MCRegisterInfo> &&RegInfo,
+                           std::unique_ptr<MCInstrInfo> &&InstInfo) {
+  auto RW = std::make_unique<AArch64MCLFIRewriter>(
+      S.getContext(), std::move(RegInfo), std::move(InstInfo));
+  auto *Ptr = RW.get();
+  S.setLFIRewriter(std::move(RW));
+  return Ptr;
+}
+
 // Force static initialization.
 extern "C" LLVM_ABI LLVM_EXTERNAL_VISIBILITY void
 LLVMInitializeAArch64TargetMC() {
@@ -532,6 +544,9 @@ LLVMInitializeAArch64TargetMC() {
     TargetRegistry::RegisterMachOStreamer(*T, createMachOStreamer);
     TargetRegistry::RegisterCOFFStreamer(*T, createAArch64WinCOFFStreamer);
 
+    // Register the LFI rewriter.
+    TargetRegistry::RegisterMCLFIRewriter(*T, createAArch64MCLFIRewriter);
+
     // Register the obj target streamer.
     TargetRegistry::RegisterObjectTargetStreamer(
         *T, createAArch64ObjectTargetStreamer);
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/CMakeLists.txt b/llvm/lib/Target/AArch64/MCTargetDesc/CMakeLists.txt
index 7f220657e45f8..7d8d825d7220b 100644
--- a/llvm/lib/Target/AArch64/MCTargetDesc/CMakeLists.txt
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/CMakeLists.txt
@@ -6,6 +6,7 @@ add_llvm_component_library(LLVMAArch64Desc
   AArch64MCAsmInfo.cpp
   AArch64MCCodeEmitter.cpp
   AArch64MCExpr.cpp
+  AArch64MCLFIRewriter.cpp
   AArch64MCTargetDesc.cpp
   AArch64MachObjectWriter.cpp
   AArch64TargetStreamer.cpp
diff --git a/llvm/lib/Target/AArch64/SMEInstrFormats.td b/llvm/lib/Target/AArch64/SMEInstrFormats.td
index 99836aeed7c0a..edf2e37b78e4c 100644
--- a/llvm/lib/Target/AArch64/SMEInstrFormats.td
+++ b/llvm/lib/Target/AArch64/SMEInstrFormats.td
@@ -1113,6 +1113,9 @@ class sme_spill_fill_base<bit isStore, dag outs, dag ins, string opcodestr>
   let Inst{9-5}   = Rn;
   let Inst{4}     = 0b0;
   let Inst{3-0}   = imm4;
+  let MemOpAddrMode = MemOpAddrModeIndexed;
+  let MemOpBaseIdx = 3;
+  let MemOpOffsetIdx = 4;
 }
 
 let mayStore = 1 in
@@ -3685,6 +3688,8 @@ class sme2_spill_fill_vector<string mnemonic, bits<8> opc>
 
   let mayLoad     = !not(opc{7});
   let mayStore    = opc{7};
+  let MemOpAddrMode = MemOpAddrModeNoIdx;
+  let MemOpBaseIdx = 1;
 }
 
 
diff --git a/llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h b/llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
index 97777dec45863..2a5415365e86b 100644
--- a/llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
+++ b/llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
@@ -1058,6 +1058,75 @@ namespace AArch64 {
 static constexpr unsigned SVEBitsPerBlock = 128;
 static constexpr unsigned SVEMaxBitsPerVector = 2048;
 } // end namespace AArch64
+
+// TSFlags layout for memory operation fields (bits 14-26).
+// See AArch64InstrInfo.h for the full TSFlags layout.
+#define TSFLAG_MEM_OP_ADDR_MODE(X)      ((X) << 14) // 5-bits
+#define TSFLAG_MEM_OP_BASE_IDX(X)       ((X) << 19) // 4-bits
+#define TSFLAG_MEM_OP_OFFSET_IDX(X)     ((X) << 23) // 4-bits
+
+namespace AArch64 {
+
+/// Memory operation addressing mode classification for load/store instructions.
+/// Used to identify operand layout for memory operations.
+enum MemOpAddrModeType {
+  MemOpAddrModeMask       = TSFLAG_MEM_OP_ADDR_MODE(0x1f),
+  MemOpAddrModeNone       = TSFLAG_MEM_OP_ADDR_MODE(0x0),  // Not a memory op
+  MemOpAddrModeIndexed    = TSFLAG_MEM_OP_ADDR_MODE(0x1),  // [Xn, #imm]
+  MemOpAddrModeUnscaled   = TSFLAG_MEM_OP_ADDR_MODE(0x2),  // [Xn, #simm]
+  MemOpAddrModePreIdx     = TSFLAG_MEM_OP_ADDR_MODE(0x3),  // [Xn, #imm]!
+  MemOpAddrModePostIdx    = TSFLAG_MEM_OP_ADDR_MODE(0x4),  // [Xn], #imm
+  MemOpAddrModeRegOff     = TSFLAG_MEM_OP_ADDR_MODE(0x5),  // [Xn, Xm, ext]
+  MemOpAddrModeLiteral    = TSFLAG_MEM_OP_ADDR_MODE(0x6),  // PC-relative
+  MemOpAddrModeNoIdx      = TSFLAG_MEM_OP_ADDR_MODE(0x7),  // [Xn] (no offset)
+  MemOpAddrModePair       = TSFLAG_MEM_OP_ADDR_MODE(0x8),  // LDP/STP [Xn, #imm]
+  MemOpAddrModePairPre    = TSFLAG_MEM_OP_ADDR_MODE(0x9),  // LDP/STP [Xn, #imm]!
+  MemOpAddrModePairPost   = TSFLAG_MEM_OP_ADDR_MODE(0xa),  // LDP/STP [Xn], #imm
+  MemOpAddrModePostIdxReg = TSFLAG_MEM_OP_ADDR_MODE(0xb),  // [Xn], Xm (SIMD)
+};
+
+/// Mask and shift for extracting the base register operand index.
+static const uint64_t MemOpBaseIdxMask = TSFLAG_MEM_OP_BASE_IDX(0xf);
+static const unsigned MemOpBaseIdxShift = 19;
+
+/// Mask and shift for extracting the offset operand index.
+static const uint64_t MemOpOffsetIdxMask = TSFLAG_MEM_OP_OFFSET_IDX(0xf);
+static const unsigned MemOpOffsetIdxShift = 23;
+
+/// Get the base register operand index for a memory instruction.
+/// Returns -1 if this is not a memory instruction or the base register
+/// cannot be determined.
+inline int getMemOpBaseRegIdx(uint64_t TSFlags) {
+  unsigned BaseIdx = (TSFlags & MemOpBaseIdxMask) >> MemOpBaseIdxShift;
+  return BaseIdx ? static_cast<int>(BaseIdx) : -1;
+}
+
+/// Get the offset operand index for a memory instruction.
+/// Returns -1 if there is no offset operand.
+inline int getMemOpOffsetIdx(uint64_t TSFlags) {
+  unsigned OffsetIdx = (TSFlags & MemOpOffsetIdxMask) >> MemOpOffsetIdxShift;
+  return OffsetIdx ? static_cast<int>(OffsetIdx) : -1;
+}
+
+/// Returns true if this is a memory operation with pre/post-index writeback.
+inline bool isMemOpPrePostIdx(uint64_t TSFlags) {
+  switch (TSFlags & MemOpAddrModeMask) {
+  case MemOpAddrModePreIdx:
+  case MemOpAddrModePostIdx:
+  case MemOpAddrModePairPre:
+  case MemOpAddrModePairPost:
+  case MemOpAddrModePostIdxReg:
+    return true;
+  default:
+    return false;
+  }
+}
+
+} // end namespace AArch64
+
+#undef TSFLAG_MEM_OP_ADDR_MODE
+#undef TSFLAG_MEM_OP_BASE_IDX
+#undef TSFLAG_MEM_OP_OFFSET_IDX
+
 } // end namespace llvm
 
 #endif
diff --git a/llvm/test/MC/AArch64/LFI/branch.s b/llvm/test/MC/AArch64/LFI/branch.s
new file mode 100644
index 0000000000000..38504fe1181ac
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/branch.s
@@ -0,0 +1,20 @@
+// RUN: llvm-mc -triple aarch64_lfi %s | FileCheck %s
+
+foo:
+  b foo
+// CHECK: b foo
+
+  br x0
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: br x28
+
+  blr x0
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: blr x28
+
+  ret
+// CHECK: ret
+
+  ret x0
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ret x28
diff --git a/llvm/test/MC/AArch64/LFI/exclusive.s b/llvm/test/MC/AArch64/LFI/exclusive.s
new file mode 100644
index 0000000000000..5f56c5a2271b5
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/exclusive.s
@@ -0,0 +1,140 @@
+// RUN: llvm-mc -triple aarch64_lfi --aarch64-lfi-no-guard-elim %s | FileCheck %s
+
+// Load exclusive
+ldxr x0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldxr x0, [x28]
+
+ldxr w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldxr w0, [x28]
+
+ldxrb w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldxrb w0, [x28]
+
+ldxrh w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldxrh w0, [x28]
+
+// Store exclusive
+stxr w0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stxr w0, x1, [x28]
+
+stxr w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stxr w0, w1, [x28]
+
+stxrb w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stxrb w0, w1, [x28]
+
+stxrh w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stxrh w0, w1, [x28]
+
+// Load-acquire exclusive
+ldaxr x0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldaxr x0, [x28]
+
+ldaxr w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldaxr w0, [x28]
+
+ldaxrb w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldaxrb w0, [x28]
+
+ldaxrh w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldaxrh w0, [x28]
+
+// Store-release exclusive
+stlxr w0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stlxr w0, x1, [x28]
+
+stlxr w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stlxr w0, w1, [x28]
+
+stlxrb w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stlxrb w0, w1, [x28]
+
+stlxrh w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stlxrh w0, w1, [x28]
+
+// Exclusive pairs
+ldxp x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldxp x0, x1, [x28]
+
+ldxp w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldxp w0, w1, [x28]
+
+stxp w0, x1, x2, [x3]
+// CHECK:      add x28, x27, w3, uxtw
+// CHECK-NEXT: stxp w0, x1, x2, [x28]
+
+stxp w0, w1, w2, [x3]
+// CHECK:      add x28, x27, w3, uxtw
+// CHECK-NEXT: stxp w0, w1, w2, [x28]
+
+ldaxp x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldaxp x0, x1, [x28]
+
+stlxp w0, x1, x2, [x3]
+// CHECK:      add x28, x27, w3, uxtw
+// CHECK-NEXT: stlxp w0, x1, x2, [x28]
+
+// Load-acquire / Store-release (non-exclusive)
+ldar x0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldar x0, [x28]
+
+ldar w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldar w0, [x28]
+
+ldarb w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldarb w0, [x28]
+
+ldarh w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldarh w0, [x28]
+
+stlr x0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: stlr x0, [x28]
+
+stlr w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: stlr w0, [x28]
+
+stlrb w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: stlrb w0, [x28]
+
+stlrh w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: stlrh w0, [x28]
+
+// SP-relative exclusive (no sandboxing needed)
+ldxr x0, [sp]
+// CHECK: ldxr x0, [sp]
+
+stxr w0, x1, [sp]
+// CHECK: stxr w0, x1, [sp]
+
+ldar x0, [sp]
+// CHECK: ldar x0, [sp]
+
+stlr x0, [sp]
+// CHECK: stlr x0, [sp]
diff --git a/llvm/test/MC/AArch64/LFI/fp.s b/llvm/test/MC/AArch64/LFI/fp.s
new file mode 100644
index 0000000000000..38d98af82d25e
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/fp.s
@@ -0,0 +1,204 @@
+// RUN: llvm-mc -triple aarch64_lfi --aarch64-lfi-no-guard-elim %s | FileCheck %s
+
+// FP/SIMD scalar loads (zero offset -> RoW)
+ldr b0, [x1]
+// CHECK: ldr b0, [x27, w1, uxtw]
+
+ldr h0, [x1]
+// CHECK: ldr h0, [x27, w1, uxtw]
+
+ldr s0, [x1]
+// CHECK: ldr s0, [x27, w1, uxtw]
+
+ldr d0, [x1]
+// CHECK: ldr d0, [x27, w1, uxtw]
+
+ldr q0, [x1]
+// CHECK: ldr q0, [x27, w1, uxtw]
+
+// FP/SIMD scalar stores (zero offset -> RoW)
+str b0, [x1]
+// CHECK: str b0, [x27, w1, uxtw]
+
+str h0, [x1]
+// CHECK: str h0, [x27, w1, uxtw]
+
+str s0, [x1]
+// CHECK: str s0, [x27, w1, uxtw]
+
+str d0, [x1]
+// CHECK: str d0, [x27, w1, uxtw]
+
+str q0, [x1]
+// CHECK: str q0, [x27, w1, uxtw]
+
+// FP loads with non-zero offset (demoted)
+ldr s0, [x1, #4]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldr s0, [x28, #4]
+
+ldr d0, [x1, #8]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldr d0, [x28, #8]
+
+ldr q0, [x1, #16]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldr q0, [x28, #16]
+
+// FP stores with non-zero offset (demoted)
+str s0, [x1, #4]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: str s0, [x28, #4]
+
+str d0, [x1, #8]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: str d0, [x28, #8]
+
+str q0, [x1, #16]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: str q0, [x28, #16]
+
+// FP pre-index
+ldr s0, [x1, #4]!
+// CHECK:      add x1, x1, #4
+// CHECK-NEXT: ldr s0, [x27, w1, uxtw]
+
+ldr d0, [x1, #8]!
+// CHECK:      add x1, x1, #8
+// CHECK-NEXT: ldr d0, [x27, w1, uxtw]
+
+ldr q0, [x1, #16]!
+// CHECK:      add x1, x1, #16
+// CHECK-NEXT: ldr q0, [x27, w1, uxtw]
+
+str s0, [x1, #4]!
+// CHECK:      add x1, x1, #4
+// CHECK-NEXT: str s0, [x27, w1, uxtw]
+
+// FP post-index
+ldr s0, [x1], #4
+// CHECK:      ldr s0, [x27, w1, uxtw]
+// CHECK-NEXT: add x1, x1, #4
+
+ldr d0, [x1], #8
+// CHECK:      ldr d0, [x27, w1, uxtw]
+// CHECK-NEXT: add x1, x1, #8
+
+ldr q0, [x1], #16
+// CHECK:      ldr q0, [x27, w1, uxtw]
+// CHECK-NEXT: add x1, x1, #16
+
+str d0, [x1], #8
+// CHECK:      str d0, [x27, w1, uxtw]
+// CHECK-NEXT: add x1, x1, #8
+
+// FP register offset
+ldr s0, [x1, x2]
+// CHECK:      add x26, x1, x2
+// CHECK-NEXT: ldr s0, [x27, w26, uxtw]
+
+ldr d0, [x1, x2, lsl #3]
+// CHECK:      add x26, x1, x2, lsl #3
+// CHECK-NEXT: ldr d0, [x27, w26, uxtw]
+
+ldr q0, [x1, x2, lsl #4]
+// CHECK:      add x26, x1, x2, lsl #4
+// CHECK-NEXT: ldr q0, [x27, w26, uxtw]
+
+str s0, [x1, x2, lsl #2]
+// CHECK:      add x26, x1, x2, lsl #2
+// CHECK-NEXT: str s0, [x27, w26, uxtw]
+
+str d0, [x1, w2, sxtw]
+// CHECK:      add x26, x1, w2, sxtw
+// CHECK-NEXT: str d0, [x27, w26, uxtw]
+
+str q0, [x1, w2, uxtw #4]
+// CHECK:      add x26, x1, w2, uxtw #4
+// CHECK-NEXT: str q0, [x27, w26, uxtw]
+
+// FP unscaled offset
+ldur s0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldur s0, [x28]
+
+ldur d0, [x1, #8]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldur d0, [x28, #8]
+
+ldur q0, [x1, #-16]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldur q0, [x28, #-16]
+
+stur s0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: stur s0, [x28]
+
+stur d0, [x1, #8]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: stur d0, [x28, #8]
+
+// FP pair loads/stores
+ldp s0, s1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp s0, s1, [x28]
+
+ldp d0, d1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp d0, d1, [x28]
+
+ldp q0, q1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp q0, q1, [x28]
+
+stp s0, s1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stp s0, s1, [x28]
+
+stp d0, d1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stp d0, d1, [x28]
+
+stp q0, q1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stp q0, q1, [x28]
+
+// FP pair with offset
+ldp s0, s1, [x2, #8]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp s0, s1, [x28, #8]
+
+ldp d0, d1, [x2, #16]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp d0, d1, [x28, #16]
+
+// FP pair pre/post-index
+ldp s0, s1, [x2], #8
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp s0, s1, [x28]
+// CHECK-NEXT: add x2, x2, #8
+
+ldp d0, d1, [x2, #16]!
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp d0, d1, [x28, #16]
+// CHECK-NEXT: add x2, x2, #16
+
+stp q0, q1, [x2], #32
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stp q0, q1, [x28]
+// CHECK-NEXT: add x2, x2, #32
+
+stp d0, d1, [x2, #-16]!
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stp d0, d1, [x28, #-16]
+// CHECK-NEXT: sub x2, x2, #16
+
+// SP-relative FP loads (no sandboxing needed)
+ldr s0, [sp]
+// CHECK: ldr s0, [sp]
+
+ldr d0, [sp, #8]
+// CHECK: ldr d0, [sp, #8]
+
+ldp q0, q1, [sp, #32]
+// CHECK: ldp q0, q1, [sp, #32]
diff --git a/llvm/test/MC/AArch64/LFI/guard-elim.s b/llvm/test/MC/AArch64/LFI/guard-elim.s
new file mode 100644
index 0000000000000..eb0c605479a2b
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/guard-elim.s
@@ -0,0 +1,149 @@
+// RUN: llvm-mc -triple aarch64_lfi %s | FileCheck %s
+
+// Consecutive loads from the same base register share a guard.
+ldr x0, [x1, #8]
+ldr x2, [x1, #16]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldr x0, [x28, #8]
+// CHECK-NEXT: ldr x2, [x28, #16]
+
+// Modifying the base register invalidates the guard.
+ldr x4, [x3, #8]
+add x3, x3, #24
+ldr x5, [x3, #8]
+// CHECK:      add x28, x27, w3, uxtw
+// CHECK-NEXT: ldr x4, [x28, #8]
+// CHECK-NEXT: add x3, x3, #24
+// CHECK-NEXT: add x28, x27, w3, uxtw
+// CHECK-NEXT: ldr x5, [x28, #8]
+
+// A different base register requires a new guard.
+ldr x6, [x4, #8]
+ldr x7, [x5, #8]
+// CHECK:      add x28, x27, w4, uxtw
+// CHECK-NEXT: ldr x6, [x28, #8]
+// CHECK-NEXT: add x28, x27, w5, uxtw
+// CHECK-NEXT: ldr x7, [x28, #8]
+
+// Labels invalidate the guard.
+label_boundary_test:
+ldr x8, [x6, #8]
+label1:
+ldr x9, [x6, #16]
+// CHECK-LABEL: label_boundary_test:
+// CHECK-NEXT: add x28, x27, w6, uxtw
+// CHECK-NEXT: ldr x8, [x28, #8]
+// CHECK-NEXT: label1:
+// CHECK-NEXT: add x28, x27, w6, uxtw
+// CHECK-NEXT: ldr x9, [x28, #16]
+
+// Branches invalidate the guard.
+control_flow_test:
+ldr x10, [x7, #8]
+b label2
+ldr x11, [x7, #16]
+label2:
+// CHECK-LABEL: control_flow_test:
+// CHECK-NEXT: add x28, x27, w7, uxtw
+// CHECK-NEXT: ldr x10, [x28, #8]
+// CHECK-NEXT: b label2
+// CHECK-NEXT: add x28, x27, w7, uxtw
+// CHECK-NEXT: ldr x11, [x28, #16]
+// CHECK-NEXT: label2:
+
+// W register modification invalidates X guard.
+w_reg_modification:
+ldr x12, [x8, #8]
+mov w8, #0
+ldr x13, [x8, #16]
+// CHECK-LABEL: w_reg_modification:
+// CHECK-NEXT: add x28, x27, w8, uxtw
+// CHECK-NEXT: ldr x12, [x28, #8]
+// CHECK-NEXT: mov w8, #0
+// CHECK-NEXT: add x28, x27, w8, uxtw
+// CHECK-NEXT: ldr x13, [x28, #16]
+
+// Multiple consecutive accesses share a guard.
+multiple_accesses:
+ldr x14, [x9, #8]
+ldr x15, [x9, #16]
+ldr x16, [x9, #24]
+str x17, [x9, #32]
+// CHECK-LABEL: multiple_accesses:
+// CHECK-NEXT: add x28, x27, w9, uxtw
+// CHECK-NEXT: ldr x14, [x28, #8]
+// CHECK-NEXT: ldr x15, [x28, #16]
+// CHECK-NEXT: ldr x16, [x28, #24]
+// CHECK-NEXT: str x17, [x28, #32]
+
+// Mixed loads and stores share a guard.
+mixed_load_store:
+str x18, [x10, #8]
+ldr x19, [x10, #16]
+str x20, [x10, #24]
+// CHECK-LABEL: mixed_load_store:
+// CHECK-NEXT: add x28, x27, w10, uxtw
+// CHECK-NEXT: str x18, [x28, #8]
+// CHECK-NEXT: ldr x19, [x28, #16]
+// CHECK-NEXT: str x20, [x28, #24]
+
+// Non-modifying instructions don't invalidate the guard.
+non_modifying_between:
+ldr x21, [x11, #8]
+mov x0, x1
+add x2, x3, x4
+ldr x22, [x11, #16]
+// CHECK-LABEL: non_modifying_between:
+// CHECK-NEXT: add x28, x27, w11, uxtw
+// CHECK-NEXT: ldr x21, [x28, #8]
+// CHECK-NEXT: mov x0, x1
+// CHECK-NEXT: add x2, x3, x4
+// CHECK-NEXT: ldr x22, [x28, #16]
+
+// Post-index pair invalidates the guard.
+prepost_ldp:
+    ldp x0, x1, [x2]
+    ldp x3, x4, [x2], #16
+    ldp x5, x6, [x2]
+// CHECK-LABEL: prepost_ldp:
+// CHECK-NEXT: add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp x0, x1, [x28]
+// CHECK-NEXT: ldp x3, x4, [x28]
+// CHECK-NEXT: add x2, x2, #16
+// CHECK-NEXT: add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp x5, x6, [x28]
+
+prepost_stp:
+    stp x0, x1, [x2]
+    stp x3, x4, [x2], #16
+    stp x5, x6, [x2]
+// CHECK-LABEL: prepost_stp:
+// CHECK-NEXT: add x28, x27, w2, uxtw
+// CHECK-NEXT: stp x0, x1, [x28]
+// CHECK-NEXT: stp x3, x4, [x28]
+// CHECK-NEXT: add x2, x2, #16
+// CHECK-NEXT: add x28, x27, w2, uxtw
+// CHECK-NEXT: stp x5, x6, [x28]
+
+// Pre-index pair invalidates the guard.
+prepost_ldp_pre:
+    ldp x0, x1, [x2]
+    ldp x3, x4, [x2, #16]!
+    ldp x5, x6, [x2]
+// CHECK-LABEL: prepost_ldp_pre:
+// CHECK-NEXT: add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp x0, x1, [x28]
+// CHECK-NEXT: ldp x3, x4, [x28, #16]
+// CHECK-NEXT: add x2, x2, #16
+// CHECK-NEXT: add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp x5, x6, [x28]
+
+// Load into the base register invalidates the guard.
+load_into_base:
+ldr x1, [x1, #8]
+ldr x2, [x1, #16]
+// CHECK-LABEL: load_into_base:
+// CHECK-NEXT: add x28, x27, w1, uxtw
+// CHECK-NEXT: ldr x1, [x28, #8]
+// CHECK-NEXT: add x28, x27, w1, uxtw
+// CHECK-NEXT: ldr x2, [x28, #16]
diff --git a/llvm/test/MC/AArch64/LFI/jumps-only.s b/llvm/test/MC/AArch64/LFI/jumps-only.s
new file mode 100644
index 0000000000000..162a203158559
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/jumps-only.s
@@ -0,0 +1,41 @@
+// RUN: llvm-mc -triple aarch64_lfi -mattr=+no-lfi-loads,+no-lfi-stores %s | FileCheck %s
+
+// Jumps-only mode: only indirect branches, calls, and returns are sandboxed;
+// loads, stores, and SP updates pass through unchanged.
+
+ldr x0, [x1]
+// CHECK: ldr x0, [x1]
+
+ldr x0, [x1, #8]
+// CHECK: ldr x0, [x1, #8]
+
+str x0, [x1]
+// CHECK: str x0, [x1]
+
+stp x0, x1, [x2, #16]
+// CHECK: stp x0, x1, [x2, #16]
+
+add sp, sp, #8
+// CHECK: add sp, sp, #8
+
+sub sp, sp, #8
+// CHECK: sub sp, sp, #8
+
+mov sp, x0
+// CHECK: mov sp, x0
+
+br x0
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: br x28
+
+ret
+// CHECK: ret
+
+blr x1
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: blr x28
+
+bl some_func
+// CHECK: bl some_func
+
+b some_func
+// CHECK: b some_func
diff --git a/llvm/test/MC/AArch64/LFI/literal.s b/llvm/test/MC/AArch64/LFI/literal.s
new file mode 100644
index 0000000000000..8a57ea0f4b619
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/literal.s
@@ -0,0 +1,32 @@
+// RUN: not llvm-mc -triple aarch64_lfi %s 2>&1 | FileCheck %s
+
+ldr x0, foo
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+ldr w0, bar
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+ldr s0, baz
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+ldr d0, qux
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+ldr q0, quux
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+ldrsw x0, signed_word
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+foo:
+  .quad 0
+bar:
+  .word 0
+baz:
+  .single 0.0
+qux:
+  .double 0.0
+quux:
+  .zero 16
+signed_word:
+  .word -1
diff --git a/llvm/test/MC/AArch64/LFI/lse.s b/llvm/test/MC/AArch64/LFI/lse.s
new file mode 100644
index 0000000000000..7f57eda757beb
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/lse.s
@@ -0,0 +1,166 @@
+// RUN: llvm-mc -triple aarch64_lfi --aarch64-lfi-no-guard-elim %s | FileCheck %s
+// RUN: llvm-mc -triple aarch64_lfi --aarch64-lfi-no-guard-elim -mattr=+no-lfi-loads %s | FileCheck %s --check-prefix=NOLOADS
+
+.arch_extension lse
+
+// LDADD variants
+// Atomics are both loads and stores, so +no-lfi-loads must still sandbox them.
+ldadd x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldadd x0, x1, [x28]
+// NOLOADS:      add x28, x27, w2, uxtw
+// NOLOADS-NEXT: ldadd x0, x1, [x28]
+
+ldadd w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldadd w0, w1, [x28]
+
+ldadda x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldadda x0, x1, [x28]
+
+ldaddal x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldaddal x0, x1, [x28]
+
+ldaddl x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldaddl x0, x1, [x28]
+
+ldaddab w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldaddab w0, w1, [x28]
+
+ldaddah w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldaddah w0, w1, [x28]
+
+// LDCLR variants
+ldclr x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldclr x0, x1, [x28]
+
+ldclra x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldclra x0, x1, [x28]
+
+ldclral x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldclral x0, x1, [x28]
+
+ldclrl x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldclrl x0, x1, [x28]
+
+// LDEOR variants
+ldeor x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldeor x0, x1, [x28]
+
+ldeora x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldeora x0, x1, [x28]
+
+// LDSET variants
+ldset x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldset x0, x1, [x28]
+
+ldseta x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldseta x0, x1, [x28]
+
+// SWP variants
+swp x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: swp x0, x1, [x28]
+// NOLOADS:      swp x0, x1, [x28]
+
+swpa x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: swpa x0, x1, [x28]
+
+swpal x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: swpal x0, x1, [x28]
+
+swpl x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: swpl x0, x1, [x28]
+
+swpab w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: swpab w0, w1, [x28]
+
+swpah w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: swpah w0, w1, [x28]
+
+swpalb w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: swpalb w0, w1, [x28]
+
+swpalh w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: swpalh w0, w1, [x28]
+
+swpal w0, w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: swpal w0, w0, [x28]
+
+// CAS variants
+cas x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: cas x0, x1, [x28]
+// NOLOADS:      cas x0, x1, [x28]
+
+casa x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: casa x0, x1, [x28]
+
+casal x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: casal x0, x1, [x28]
+
+casl x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: casl x0, x1, [x28]
+
+casab w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: casab w0, w1, [x28]
+
+casah w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: casah w0, w1, [x28]
+
+// CASP variants (pair)
+casp x0, x1, x2, x3, [x4]
+// CHECK:      add x28, x27, w4, uxtw
+// CHECK-NEXT: casp x0, x1, x2, x3, [x28]
+
+caspa x0, x1, x2, x3, [x4]
+// CHECK:      add x28, x27, w4, uxtw
+// CHECK-NEXT: caspa x0, x1, x2, x3, [x28]
+
+caspal x0, x1, x2, x3, [x4]
+// CHECK:      add x28, x27, w4, uxtw
+// CHECK-NEXT: caspal x0, x1, x2, x3, [x28]
+
+caspl x0, x1, x2, x3, [x4]
+// CHECK:      add x28, x27, w4, uxtw
+// CHECK-NEXT: caspl x0, x1, x2, x3, [x28]
+
+caspal w0, w1, w2, w3, [x4]
+// CHECK:      add x28, x27, w4, uxtw
+// CHECK-NEXT: caspal w0, w1, w2, w3, [x28]
+
+// SP-relative atomics (no sandboxing needed)
+ldadd x0, x1, [sp]
+// CHECK: ldadd x0, x1, [sp]
+
+swp x0, x1, [sp]
+// CHECK: swp x0, x1, [sp]
+
+cas x0, x1, [sp]
+// CHECK: cas x0, x1, [sp]
diff --git a/llvm/test/MC/AArch64/LFI/mem.s b/llvm/test/MC/AArch64/LFI/mem.s
new file mode 100644
index 0000000000000..8d7764f2b6643
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/mem.s
@@ -0,0 +1,437 @@
+// RUN: llvm-mc -triple aarch64_lfi --aarch64-lfi-no-guard-elim %s | FileCheck %s
+
+ldr x0, [sp]
+// CHECK: ldr x0, [sp]
+
+ldr x0, [sp, #8]
+// CHECK: ldr x0, [sp, #8]
+
+ldp x0, x1, [sp, #8]
+// CHECK: ldp x0, x1, [sp, #8]
+
+str x0, [sp]
+// CHECK: str x0, [sp]
+
+str x0, [sp, #8]
+// CHECK: str x0, [sp, #8]
+
+stp x0, x1, [sp, #8]
+// CHECK: stp x0, x1, [sp, #8]
+
+ldur x0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldur x0, [x28]
+
+stur x0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: stur x0, [x28]
+
+ldp x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp x0, x1, [x28]
+
+stp x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stp x0, x1, [x28]
+
+ldr x0, [x1]
+// CHECK: ldr x0, [x27, w1, uxtw]
+
+ldr x0, [x1, #8]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldr x0, [x28, #8]
+
+ldr x0, [x1, #8]!
+// CHECK:      add x1, x1, #8
+// CHECK-NEXT: ldr x0, [x27, w1, uxtw]
+
+str x0, [x1, #8]!
+// CHECK:      add x1, x1, #8
+// CHECK-NEXT: str x0, [x27, w1, uxtw]
+
+ldr x0, [x1, #-8]!
+// CHECK:      sub x1, x1, #8
+// CHECK-NEXT: ldr x0, [x27, w1, uxtw]
+
+str x0, [x1, #-8]!
+// CHECK:      sub x1, x1, #8
+// CHECK-NEXT: str x0, [x27, w1, uxtw]
+
+ldr x0, [x1], #8
+// CHECK:      ldr x0, [x27, w1, uxtw]
+// CHECK-NEXT: add x1, x1, #8
+
+str x0, [x1], #8
+// CHECK:      str x0, [x27, w1, uxtw]
+// CHECK-NEXT: add x1, x1, #8
+
+ldr x0, [x1], #-8
+// CHECK:      ldr x0, [x27, w1, uxtw]
+// CHECK-NEXT: sub x1, x1, #8
+
+str x0, [x1], #-8
+// CHECK:      str x0, [x27, w1, uxtw]
+// CHECK-NEXT: sub x1, x1, #8
+
+ldr x0, [x1, x2]
+// CHECK:      add x26, x1, x2
+// CHECK-NEXT: ldr x0, [x27, w26, uxtw]
+
+ldr x0, [x1, x2, lsl #3]
+// CHECK:      add x26, x1, x2, lsl #3
+// CHECK-NEXT: ldr x0, [x27, w26, uxtw]
+
+ldr x0, [x1, x2, sxtx #0]
+// CHECK:      add x26, x1, x2, sxtx
+// CHECK-NEXT: ldr x0, [x27, w26, uxtw]
+
+ldr x0, [x1, x2, sxtx #3]
+// CHECK:      add x26, x1, x2, sxtx #3
+// CHECK-NEXT: ldr x0, [x27, w26, uxtw]
+
+ldr x0, [x1, w2, uxtw]
+// CHECK:      add x26, x1, w2, uxtw
+// CHECK-NEXT: ldr x0, [x27, w26, uxtw]
+
+ldr x0, [x1, w2, uxtw #3]
+// CHECK:      add x26, x1, w2, uxtw #3
+// CHECK-NEXT: ldr x0, [x27, w26, uxtw]
+
+ldr x0, [x1, w2, sxtw]
+// CHECK:      add x26, x1, w2, sxtw
+// CHECK-NEXT: ldr x0, [x27, w26, uxtw]
+
+ldr x0, [x1, w2, sxtw #3]
+// CHECK:      add x26, x1, w2, sxtw #3
+// CHECK-NEXT: ldr x0, [x27, w26, uxtw]
+
+ldp x0, x1, [sp], #8
+// CHECK: ldp x0, x1, [sp], #8
+
+ldp x0, x1, [x2], #8
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp x0, x1, [x28]
+// CHECK-NEXT: add x2, x2, #8
+
+ldp x0, x1, [x2, #8]!
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp x0, x1, [x28, #8]
+// CHECK-NEXT: add x2, x2, #8
+
+ldp x0, x1, [x2], #-8
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp x0, x1, [x28]
+// CHECK-NEXT: sub x2, x2, #8
+
+ldp x0, x1, [x2, #-8]!
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp x0, x1, [x28, #-8]
+// CHECK-NEXT: sub x2, x2, #8
+
+stp x0, x1, [x2, #-8]!
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stp x0, x1, [x28, #-8]
+// CHECK-NEXT: sub x2, x2, #8
+
+ld3 { v0.4s, v1.4s, v2.4s }, [x0], #48
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3 { v0.4s, v1.4s, v2.4s }, [x28]
+// CHECK-NEXT: add x0, x0, #48
+
+st2 { v1.8b, v2.8b }, [x14], #16
+// CHECK:      add x28, x27, w14, uxtw
+// CHECK-NEXT: st2 { v1.8b, v2.8b }, [x28]
+// CHECK-NEXT: add x14, x14, #16
+
+st2 { v1.8b, v2.8b }, [x14]
+// CHECK:      add x28, x27, w14, uxtw
+// CHECK-NEXT: st2 { v1.8b, v2.8b }, [x28]
+
+ld1 { v0.s }[1], [x8]
+// CHECK:      add x28, x27, w8, uxtw
+// CHECK-NEXT: ld1 { v0.s }[1], [x28]
+
+ld1r { v3.2d }, [x9]
+// CHECK:      add x28, x27, w9, uxtw
+// CHECK-NEXT: ld1r { v3.2d }, [x28]
+
+ld1 { v0.s }[1], [x8], x10
+// CHECK:      add x28, x27, w8, uxtw
+// CHECK-NEXT: ld1 { v0.s }[1], [x28]
+// CHECK-NEXT: add x8, x8, x10
+
+ldaxr x0, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldaxr x0, [x28]
+
+stlxr w15, w17, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: stlxr w15, w17, [x28]
+
+ldr w4, [sp, w3, uxtw #2]
+// CHECK:      add x26, sp, w3, uxtw #2
+// CHECK-NEXT: ldr w4, [x27, w26, uxtw]
+
+stxrb w11, w10, [x8]
+// CHECK:      add x28, x27, w8, uxtw
+// CHECK-NEXT: stxrb w11, w10, [x28]
+
+ldr x0, [x0, :got_lo12:x]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ldr x0, [x28, :got_lo12:x]
+
+prfm pstl1strm, [x10]
+// CHECK: prfm pstl1strm, [x27, w10, uxtw]
+
+prfm pstl1strm, [x10, x11]
+// CHECK:      add x26, x10, x11
+// CHECK-NEXT: prfm pstl1strm, [x27, w26, uxtw]
+
+// Byte loads/stores
+ldrb w0, [x1]
+// CHECK: ldrb w0, [x27, w1, uxtw]
+
+ldrb w0, [x1, #1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldrb w0, [x28, #1]
+
+strb w0, [x1]
+// CHECK: strb w0, [x27, w1, uxtw]
+
+ldrsb w0, [x1]
+// CHECK: ldrsb w0, [x27, w1, uxtw]
+
+ldrsb x0, [x1]
+// CHECK: ldrsb x0, [x27, w1, uxtw]
+
+// Halfword loads/stores
+ldrh w0, [x1]
+// CHECK: ldrh w0, [x27, w1, uxtw]
+
+ldrh w0, [x1, #2]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldrh w0, [x28, #2]
+
+strh w0, [x1]
+// CHECK: strh w0, [x27, w1, uxtw]
+
+ldrsh w0, [x1]
+// CHECK: ldrsh w0, [x27, w1, uxtw]
+
+ldrsh x0, [x1]
+// CHECK: ldrsh x0, [x27, w1, uxtw]
+
+// Word loads/stores
+ldr w0, [x1]
+// CHECK: ldr w0, [x27, w1, uxtw]
+
+ldr w0, [x1, #4]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldr w0, [x28, #4]
+
+str w0, [x1]
+// CHECK: str w0, [x27, w1, uxtw]
+
+ldrsw x0, [x1]
+// CHECK: ldrsw x0, [x27, w1, uxtw]
+
+// 32-bit pairs
+ldp w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp w0, w1, [x28]
+
+stp w0, w1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stp w0, w1, [x28]
+
+// Unscaled loads/stores
+ldurb w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldurb w0, [x28]
+
+ldurb w0, [x1, #1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldurb w0, [x28, #1]
+
+ldursb w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldursb w0, [x28]
+
+ldurh w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldurh w0, [x28]
+
+ldursh w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldursh w0, [x28]
+
+ldur w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldur w0, [x28]
+
+ldursw x0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: ldursw x0, [x28]
+
+sturb w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: sturb w0, [x28]
+
+sturh w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: sturh w0, [x28]
+
+stur w0, [x1]
+// CHECK:      add x28, x27, w1, uxtw
+// CHECK-NEXT: stur w0, [x28]
+
+// Byte pre/post-index
+ldrb w0, [x1, #1]!
+// CHECK:      add x1, x1, #1
+// CHECK-NEXT: ldrb w0, [x27, w1, uxtw]
+
+ldrb w0, [x1], #1
+// CHECK:      ldrb w0, [x27, w1, uxtw]
+// CHECK-NEXT: add x1, x1, #1
+
+strb w0, [x1, #1]!
+// CHECK:      add x1, x1, #1
+// CHECK-NEXT: strb w0, [x27, w1, uxtw]
+
+strb w0, [x1], #1
+// CHECK:      strb w0, [x27, w1, uxtw]
+// CHECK-NEXT: add x1, x1, #1
+
+// Halfword pre/post-index
+ldrh w0, [x1, #2]!
+// CHECK:      add x1, x1, #2
+// CHECK-NEXT: ldrh w0, [x27, w1, uxtw]
+
+ldrh w0, [x1], #2
+// CHECK:      ldrh w0, [x27, w1, uxtw]
+// CHECK-NEXT: add x1, x1, #2
+
+// Word pre/post-index
+ldr w0, [x1, #4]!
+// CHECK:      add x1, x1, #4
+// CHECK-NEXT: ldr w0, [x27, w1, uxtw]
+
+ldr w0, [x1], #4
+// CHECK:      ldr w0, [x27, w1, uxtw]
+// CHECK-NEXT: add x1, x1, #4
+
+// Register offset with different sizes
+ldrb w0, [x1, x2]
+// CHECK:      add x26, x1, x2
+// CHECK-NEXT: ldrb w0, [x27, w26, uxtw]
+
+ldrh w0, [x1, x2]
+// CHECK:      add x26, x1, x2
+// CHECK-NEXT: ldrh w0, [x27, w26, uxtw]
+
+ldrh w0, [x1, x2, lsl #1]
+// CHECK:      add x26, x1, x2, lsl #1
+// CHECK-NEXT: ldrh w0, [x27, w26, uxtw]
+
+ldr w0, [x1, x2]
+// CHECK:      add x26, x1, x2
+// CHECK-NEXT: ldr w0, [x27, w26, uxtw]
+
+ldr w0, [x1, x2, lsl #2]
+// CHECK:      add x26, x1, x2, lsl #2
+// CHECK-NEXT: ldr w0, [x27, w26, uxtw]
+
+strb w0, [x1, x2]
+// CHECK:      add x26, x1, x2
+// CHECK-NEXT: strb w0, [x27, w26, uxtw]
+
+strh w0, [x1, x2]
+// CHECK:      add x26, x1, x2
+// CHECK-NEXT: strh w0, [x27, w26, uxtw]
+
+str w0, [x1, x2]
+// CHECK:      add x26, x1, x2
+// CHECK-NEXT: str w0, [x27, w26, uxtw]
+
+str x0, [x1, x2]
+// CHECK:      add x26, x1, x2
+// CHECK-NEXT: str x0, [x27, w26, uxtw]
+
+// Sign/zero extension variants
+ldrb w0, [x1, w2, uxtw]
+// CHECK:      add x26, x1, w2, uxtw
+// CHECK-NEXT: ldrb w0, [x27, w26, uxtw]
+
+ldrb w0, [x1, w2, sxtw]
+// CHECK:      add x26, x1, w2, sxtw
+// CHECK-NEXT: ldrb w0, [x27, w26, uxtw]
+
+ldrh w0, [x1, w2, uxtw]
+// CHECK:      add x26, x1, w2, uxtw
+// CHECK-NEXT: ldrh w0, [x27, w26, uxtw]
+
+ldrh w0, [x1, w2, uxtw #1]
+// CHECK:      add x26, x1, w2, uxtw #1
+// CHECK-NEXT: ldrh w0, [x27, w26, uxtw]
+
+ldr w0, [x1, w2, sxtw #2]
+// CHECK:      add x26, x1, w2, sxtw #2
+// CHECK-NEXT: ldr w0, [x27, w26, uxtw]
+
+// Sign-extending loads with explicit shift amounts (a #0 shift is omitted in
+// the output).
+ldrsb x0, [x1, x2, sxtx #0]
+// CHECK:      add x26, x1, x2, sxtx{{$}}
+// CHECK-NEXT: ldrsb x0, [x27, w26, uxtw]
+
+ldrsb w0, [x1, x2, sxtx #0]
+// CHECK:      add x26, x1, x2, sxtx{{$}}
+// CHECK-NEXT: ldrsb w0, [x27, w26, uxtw]
+
+ldrsb w0, [x1, w2, sxtw #0]
+// CHECK:      add x26, x1, w2, sxtw{{$}}
+// CHECK-NEXT: ldrsb w0, [x27, w26, uxtw]
+
+ldrsb x0, [x1, w2, uxtw #0]
+// CHECK:      add x26, x1, w2, uxtw{{$}}
+// CHECK-NEXT: ldrsb x0, [x27, w26, uxtw]
+
+ldrsh x0, [x1, x2, sxtx #1]
+// CHECK:      add x26, x1, x2, sxtx #1
+// CHECK-NEXT: ldrsh x0, [x27, w26, uxtw]
+
+ldrsh w0, [x1, x2, sxtx #1]
+// CHECK:      add x26, x1, x2, sxtx #1
+// CHECK-NEXT: ldrsh w0, [x27, w26, uxtw]
+
+ldrsh w0, [x1, w2, sxtw #1]
+// CHECK:      add x26, x1, w2, sxtw #1
+// CHECK-NEXT: ldrsh w0, [x27, w26, uxtw]
+
+ldrsw x0, [x1, x2, sxtx #2]
+// CHECK:      add x26, x1, x2, sxtx #2
+// CHECK-NEXT: ldrsw x0, [x27, w26, uxtw]
+
+ldrsw x0, [x1, w2, sxtw #2]
+// CHECK:      add x26, x1, w2, sxtw #2
+// CHECK-NEXT: ldrsw x0, [x27, w26, uxtw]
+
+// 32-bit pair pre/post-index
+ldp w0, w1, [x2], #8
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp w0, w1, [x28]
+// CHECK-NEXT: add x2, x2, #8
+
+ldp w0, w1, [x2, #8]!
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: ldp w0, w1, [x28, #8]
+// CHECK-NEXT: add x2, x2, #8
+
+stp w0, w1, [x2], #8
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stp w0, w1, [x28]
+// CHECK-NEXT: add x2, x2, #8
+
+stp w0, w1, [x2, #8]!
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stp w0, w1, [x28, #8]
+// CHECK-NEXT: add x2, x2, #8
diff --git a/llvm/test/MC/AArch64/LFI/no-lfi-loads.s b/llvm/test/MC/AArch64/LFI/no-lfi-loads.s
new file mode 100644
index 0000000000000..1a606452275f2
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/no-lfi-loads.s
@@ -0,0 +1,33 @@
+// RUN: llvm-mc -triple aarch64_lfi -mattr=+no-lfi-loads %s | FileCheck %s
+
+// Stores-only mode: loads pass through, stores are sandboxed.
+
+ldr x0, [x1]
+// CHECK: ldr x0, [x1]
+
+ldr x0, [x1, #8]
+// CHECK: ldr x0, [x1, #8]
+
+ldp x0, x1, [x2]
+// CHECK: ldp x0, x1, [x2]
+
+str x0, [x1]
+// CHECK: str x0, [x27, w1, uxtw]
+
+stp x0, x1, [x2]
+// CHECK:      add x28, x27, w2, uxtw
+// CHECK-NEXT: stp x0, x1, [x28]
+
+ldr x0, [sp, #8]
+// CHECK: ldr x0, [sp, #8]
+
+str x0, [sp, #8]
+// CHECK: str x0, [sp, #8]
+
+add sp, sp, #8
+// CHECK:      add x26, sp, #8
+// CHECK-NEXT: add sp, x27, w26, uxtw
+
+br x0
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: br x28
diff --git a/llvm/test/MC/AArch64/LFI/other.s b/llvm/test/MC/AArch64/LFI/other.s
new file mode 100644
index 0000000000000..cb3c0ca963db2
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/other.s
@@ -0,0 +1,6 @@
+// RUN: llvm-mc -triple aarch64_lfi %s | FileCheck %s
+
+.lfi_rewrite_disable
+ldr x0, [x1]
+// CHECK: ldr x0, [x1]
+.lfi_rewrite_enable
diff --git a/llvm/test/MC/AArch64/LFI/pac.s b/llvm/test/MC/AArch64/LFI/pac.s
new file mode 100644
index 0000000000000..24a9327729221
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/pac.s
@@ -0,0 +1,55 @@
+// RUN: llvm-mc -triple aarch64_lfi %s | FileCheck %s
+
+// Authenticated instructions are expanded to authenticate + guard + branch.
+
+.arch_extension pauth
+
+retaa
+// CHECK:      autiasp
+// CHECK-NEXT: add x30, x27, w30, uxtw
+// CHECK-NEXT: ret
+
+retab
+// CHECK:      autibsp
+// CHECK-NEXT: add x30, x27, w30, uxtw
+// CHECK-NEXT: ret
+
+braa x0, x1
+// CHECK:      autia x0, x1
+// CHECK-NEXT: add x28, x27, w0, uxtw
+// CHECK-NEXT: br x28
+
+braaz x2
+// CHECK:      autiza x2
+// CHECK-NEXT: add x28, x27, w2, uxtw
+// CHECK-NEXT: br x28
+
+brab x3, x4
+// CHECK:      autib x3, x4
+// CHECK-NEXT: add x28, x27, w3, uxtw
+// CHECK-NEXT: br x28
+
+brabz x5
+// CHECK:      autizb x5
+// CHECK-NEXT: add x28, x27, w5, uxtw
+// CHECK-NEXT: br x28
+
+blraa x0, x1
+// CHECK:      autia x0, x1
+// CHECK-NEXT: add x28, x27, w0, uxtw
+// CHECK-NEXT: blr x28
+
+blraaz x2
+// CHECK:      autiza x2
+// CHECK-NEXT: add x28, x27, w2, uxtw
+// CHECK-NEXT: blr x28
+
+blrab x3, x4
+// CHECK:      autib x3, x4
+// CHECK-NEXT: add x28, x27, w3, uxtw
+// CHECK-NEXT: blr x28
+
+blrabz x5
+// CHECK:      autizb x5
+// CHECK-NEXT: add x28, x27, w5, uxtw
+// CHECK-NEXT: blr x28
diff --git a/llvm/test/MC/AArch64/LFI/prefetch.s b/llvm/test/MC/AArch64/LFI/prefetch.s
new file mode 100644
index 0000000000000..cd0ec708c3293
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/prefetch.s
@@ -0,0 +1,81 @@
+// RUN: llvm-mc -triple aarch64_lfi --aarch64-lfi-no-guard-elim %s | FileCheck %s
+
+prfm pldl1keep, [x0]
+// CHECK: prfm pldl1keep, [x27, w0, uxtw]
+
+prfm pldl1strm, [x0]
+// CHECK: prfm pldl1strm, [x27, w0, uxtw]
+
+prfm pldl2keep, [x0]
+// CHECK: prfm pldl2keep, [x27, w0, uxtw]
+
+prfm pldl2strm, [x0]
+// CHECK: prfm pldl2strm, [x27, w0, uxtw]
+
+prfm pldl3keep, [x0]
+// CHECK: prfm pldl3keep, [x27, w0, uxtw]
+
+prfm pldl3strm, [x0]
+// CHECK: prfm pldl3strm, [x27, w0, uxtw]
+
+prfm pstl1keep, [x0]
+// CHECK: prfm pstl1keep, [x27, w0, uxtw]
+
+prfm pstl1strm, [x0]
+// CHECK: prfm pstl1strm, [x27, w0, uxtw]
+
+prfm pstl2keep, [x0]
+// CHECK: prfm pstl2keep, [x27, w0, uxtw]
+
+prfm pstl2strm, [x0]
+// CHECK: prfm pstl2strm, [x27, w0, uxtw]
+
+prfm pldl1keep, [x0, #8]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: prfm pldl1keep, [x28, #8]
+
+prfm pstl1strm, [x0, #16]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: prfm pstl1strm, [x28, #16]
+
+prfm pldl1keep, [x0, x1]
+// CHECK:      add x26, x0, x1
+// CHECK-NEXT: prfm pldl1keep, [x27, w26, uxtw]
+
+prfm pldl1keep, [x0, x1, lsl #3]
+// CHECK:      add x26, x0, x1, lsl #3
+// CHECK-NEXT: prfm pldl1keep, [x27, w26, uxtw]
+
+prfm pldl1keep, [x0, w1, uxtw]
+// CHECK:      add x26, x0, w1, uxtw
+// CHECK-NEXT: prfm pldl1keep, [x27, w26, uxtw]
+
+prfm pldl1keep, [x0, w1, sxtw]
+// CHECK:      add x26, x0, w1, sxtw
+// CHECK-NEXT: prfm pldl1keep, [x27, w26, uxtw]
+
+prfm pldl1keep, [x0, w1, uxtw #3]
+// CHECK:      add x26, x0, w1, uxtw #3
+// CHECK-NEXT: prfm pldl1keep, [x27, w26, uxtw]
+
+prfm pldl1keep, [x0, w1, sxtw #3]
+// CHECK:      add x26, x0, w1, sxtw #3
+// CHECK-NEXT: prfm pldl1keep, [x27, w26, uxtw]
+
+prfum pldl1keep, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: prfum pldl1keep, [x28]
+
+prfum pldl1keep, [x0, #1]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: prfum pldl1keep, [x28, #1]
+
+prfum pstl1strm, [x0, #-8]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: prfum pstl1strm, [x28, #-8]
+
+prfm pldl1keep, [sp]
+// CHECK: prfm pldl1keep, [sp]
+
+prfm pldl1keep, [sp, #8]
+// CHECK: prfm pldl1keep, [sp, #8]
diff --git a/llvm/test/MC/AArch64/LFI/rcpc.s b/llvm/test/MC/AArch64/LFI/rcpc.s
new file mode 100644
index 0000000000000..ad89517f1e8a2
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/rcpc.s
@@ -0,0 +1,19 @@
+// RUN: llvm-mc -triple aarch64_lfi --aarch64-lfi-no-guard-elim %s | FileCheck %s
+
+.arch_extension rcpc
+
+ldapr x0, [x8]
+// CHECK:      add x28, x27, w8, uxtw
+// CHECK-NEXT: ldapr x0, [x28]
+
+ldapr w0, [x8]
+// CHECK:      add x28, x27, w8, uxtw
+// CHECK-NEXT: ldapr w0, [x28]
+
+ldaprh w0, [x8]
+// CHECK:      add x28, x27, w8, uxtw
+// CHECK-NEXT: ldaprh w0, [x28]
+
+ldaprb w0, [x8]
+// CHECK:      add x28, x27, w8, uxtw
+// CHECK-NEXT: ldaprb w0, [x28]
diff --git a/llvm/test/MC/AArch64/LFI/reserved.s b/llvm/test/MC/AArch64/LFI/reserved.s
new file mode 100644
index 0000000000000..8ad5e7c56bb9e
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/reserved.s
@@ -0,0 +1,45 @@
+// RUN: not llvm-mc -triple aarch64_lfi %s 2>&1 | FileCheck %s
+
+mov x27, x0
+// CHECK: error: illegal modification of reserved LFI register
+// CHECK:        mov x27, x0
+
+ldr x27, [x0]
+// CHECK: error: illegal modification of reserved LFI register
+// CHECK:        ldr x27, [x0]
+
+add x27, x0, x1
+// CHECK: error: illegal modification of reserved LFI register
+// CHECK:        add x27, x0, x1
+
+mov x28, x0
+// CHECK: error: illegal modification of reserved LFI register
+// CHECK:        mov x28, x0
+
+ldr x28, [x0]
+// CHECK: error: illegal modification of reserved LFI register
+// CHECK:        ldr x28, [x0]
+
+add x28, x0, x1
+// CHECK: error: illegal modification of reserved LFI register
+// CHECK:        add x28, x0, x1
+
+ldp x27, x28, [x0]
+// CHECK: error: illegal modification of reserved LFI register
+// CHECK:        ldp x27, x28, [x0]
+
+ldp x0, x27, [x1]
+// CHECK: error: illegal modification of reserved LFI register
+// CHECK:        ldp x0, x27, [x1]
+
+ldp x28, x0, [x1]
+// CHECK: error: illegal modification of reserved LFI register
+// CHECK:        ldp x28, x0, [x1]
+
+ldr x0, [x27], #8
+// CHECK: error: illegal modification of reserved LFI register
+// CHECK:        ldr x0, [x27], #8
+
+ldr x0, [x28, #8]!
+// CHECK: error: illegal modification of reserved LFI register
+// CHECK:        ldr x0, [x28, #8]!
diff --git a/llvm/test/MC/AArch64/LFI/return.s b/llvm/test/MC/AArch64/LFI/return.s
new file mode 100644
index 0000000000000..d41266f9579d0
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/return.s
@@ -0,0 +1,72 @@
+// RUN: llvm-mc -triple aarch64_lfi %s | FileCheck %s
+
+// LR guard is deferred until the next control flow instruction for PAC
+// compatibility.
+
+.arch_extension pauth
+
+mov x30, x0
+ret
+// CHECK:      mov x30, x0
+// CHECK-NEXT: add x30, x27, w30, uxtw
+// CHECK-NEXT: ret
+
+ldr x30, [sp]
+ret
+// CHECK:      ldr x30, [sp]
+// CHECK-NEXT: add x30, x27, w30, uxtw
+// CHECK-NEXT: ret
+
+ldp x29, x30, [sp]
+ret
+// CHECK:      ldp x29, x30, [sp]
+// CHECK-NEXT: add x30, x27, w30, uxtw
+// CHECK-NEXT: ret
+
+// Deferred guard flushed before a label.
+mov x30, x0
+next_func:
+nop
+// CHECK:      mov x30, x0
+// CHECK-NEXT: add x30, x27, w30, uxtw
+// CHECK:      nop
+
+// AUTIASP strips PAC, so a deferred guard is set.
+autiasp
+ret
+// CHECK:      autiasp
+// CHECK-NEXT: add x30, x27, w30, uxtw
+// CHECK-NEXT: ret
+
+// PACIASP just signs LR and doesn't need a guard.
+paciasp
+nop
+// CHECK:      paciasp
+// CHECK-NEXT: nop
+
+// Deferred guard flushed before bl.
+mov x30, x0
+bl some_func
+// CHECK:      mov x30, x0
+// CHECK-NEXT: add x30, x27, w30, uxtw
+// CHECK-NEXT: bl some_func
+
+// Deferred guard flushed before blr.
+mov x30, x0
+blr x1
+// CHECK:      mov x30, x0
+// CHECK-NEXT: add x30, x27, w30, uxtw
+// CHECK-NEXT: add x28, x27, w1, uxtw
+// CHECK-NEXT: blr x28
+
+// Deferred guard flushed before branch.
+mov x30, x0
+b some_func
+// CHECK:      mov x30, x0
+// CHECK-NEXT: add x30, x27, w30, uxtw
+// CHECK-NEXT: b some_func
+
+// Deferred guard flushed at end of stream.
+mov x30, x0
+// CHECK:      mov x30, x0
+// CHECK-NEXT: add x30, x27, w30, uxtw
diff --git a/llvm/test/MC/AArch64/LFI/simd.s b/llvm/test/MC/AArch64/LFI/simd.s
new file mode 100644
index 0000000000000..7da1fa5e3fd81
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/simd.s
@@ -0,0 +1,472 @@
+// RUN: llvm-mc -triple aarch64_lfi --aarch64-lfi-no-guard-elim %s | FileCheck %s
+// LD1/ST1 single structure (no post-index)
+ld1 { v0.b }[0], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.b }[0], [x28]
+ld1 { v0.h }[1], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.h }[1], [x28]
+ld1 { v0.s }[2], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.s }[2], [x28]
+ld1 { v0.d }[1], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.d }[1], [x28]
+st1 { v0.b }[0], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.b }[0], [x28]
+st1 { v0.h }[1], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.h }[1], [x28]
+st1 { v0.s }[2], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.s }[2], [x28]
+st1 { v0.d }[1], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.d }[1], [x28]
+// LD1/ST1 single structure with post-index (natural offset)
+ld1 { v0.b }[0], [x0], #1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, #1
+ld1 { v0.h }[1], [x0], #2
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.h }[1], [x28]
+// CHECK-NEXT: add x0, x0, #2
+ld1 { v0.s }[2], [x0], #4
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.s }[2], [x28]
+// CHECK-NEXT: add x0, x0, #4
+ld1 { v0.d }[1], [x0], #8
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.d }[1], [x28]
+// CHECK-NEXT: add x0, x0, #8
+st1 { v0.b }[0], [x0], #1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, #1
+st1 { v0.h }[1], [x0], #2
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.h }[1], [x28]
+// CHECK-NEXT: add x0, x0, #2
+st1 { v0.s }[2], [x0], #4
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.s }[2], [x28]
+// CHECK-NEXT: add x0, x0, #4
+st1 { v0.d }[1], [x0], #8
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.d }[1], [x28]
+// CHECK-NEXT: add x0, x0, #8
+// LD1/ST1 single structure with post-index (register offset)
+ld1 { v0.b }[0], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, x1
+ld1 { v0.s }[2], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.s }[2], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st1 { v0.d }[1], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.d }[1], [x28]
+// CHECK-NEXT: add x0, x0, x1
+// LD1R (replicate single element to all lanes)
+ld1r { v0.8b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.8b }, [x28]
+ld1r { v0.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.16b }, [x28]
+ld1r { v0.4h }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.4h }, [x28]
+ld1r { v0.8h }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.8h }, [x28]
+ld1r { v0.2s }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.2s }, [x28]
+ld1r { v0.4s }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.4s }, [x28]
+ld1r { v0.1d }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.1d }, [x28]
+ld1r { v0.2d }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.2d }, [x28]
+// LD1R with post-index
+ld1r { v0.8b }, [x0], #1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.8b }, [x28]
+// CHECK-NEXT: add x0, x0, #1
+ld1r { v0.4h }, [x0], #2
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.4h }, [x28]
+// CHECK-NEXT: add x0, x0, #2
+ld1r { v0.2s }, [x0], #4
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.2s }, [x28]
+// CHECK-NEXT: add x0, x0, #4
+ld1r { v0.1d }, [x0], #8
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.1d }, [x28]
+// CHECK-NEXT: add x0, x0, #8
+ld1r { v0.2d }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1r { v0.2d }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+// LD1/ST1 multiple structures (1-4 registers)
+ld1 { v0.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.16b }, [x28]
+ld1 { v0.16b, v1.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.16b, v1.16b }, [x28]
+ld1 { v0.16b, v1.16b, v2.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.16b, v1.16b, v2.16b }, [x28]
+ld1 { v0.16b, v1.16b, v2.16b, v3.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.16b, v1.16b, v2.16b, v3.16b }, [x28]
+st1 { v0.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.16b }, [x28]
+st1 { v0.16b, v1.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.16b, v1.16b }, [x28]
+st1 { v0.16b, v1.16b, v2.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.16b, v1.16b, v2.16b }, [x28]
+st1 { v0.16b, v1.16b, v2.16b, v3.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.16b, v1.16b, v2.16b, v3.16b }, [x28]
+// LD1/ST1 multiple structures with post-index
+ld1 { v0.16b }, [x0], #16
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.16b }, [x28]
+// CHECK-NEXT: add x0, x0, #16
+ld1 { v0.16b, v1.16b }, [x0], #32
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.16b, v1.16b }, [x28]
+// CHECK-NEXT: add x0, x0, #32
+ld1 { v0.16b, v1.16b, v2.16b }, [x0], #48
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.16b, v1.16b, v2.16b }, [x28]
+// CHECK-NEXT: add x0, x0, #48
+ld1 { v0.16b, v1.16b, v2.16b, v3.16b }, [x0], #64
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.16b, v1.16b, v2.16b, v3.16b }, [x28]
+// CHECK-NEXT: add x0, x0, #64
+ld1 { v0.16b }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld1 { v0.16b }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+// LD2/ST2 multiple structures
+ld2 { v0.8b, v1.8b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.8b, v1.8b }, [x28]
+ld2 { v0.16b, v1.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.16b, v1.16b }, [x28]
+ld2 { v0.4h, v1.4h }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.4h, v1.4h }, [x28]
+ld2 { v0.8h, v1.8h }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.8h, v1.8h }, [x28]
+ld2 { v0.2s, v1.2s }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.2s, v1.2s }, [x28]
+ld2 { v0.4s, v1.4s }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.4s, v1.4s }, [x28]
+ld2 { v0.2d, v1.2d }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.2d, v1.2d }, [x28]
+st2 { v0.16b, v1.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.16b, v1.16b }, [x28]
+ld2 { v0.16b, v1.16b }, [x0], #32
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.16b, v1.16b }, [x28]
+// CHECK-NEXT: add x0, x0, #32
+st2 { v0.16b, v1.16b }, [x0], #32
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.16b, v1.16b }, [x28]
+// CHECK-NEXT: add x0, x0, #32
+// LD3/ST3 multiple structures
+ld3 { v0.8b, v1.8b, v2.8b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3 { v0.8b, v1.8b, v2.8b }, [x28]
+ld3 { v0.16b, v1.16b, v2.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3 { v0.16b, v1.16b, v2.16b }, [x28]
+ld3 { v0.4s, v1.4s, v2.4s }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3 { v0.4s, v1.4s, v2.4s }, [x28]
+st3 { v0.4s, v1.4s, v2.4s }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st3 { v0.4s, v1.4s, v2.4s }, [x28]
+ld3 { v0.4s, v1.4s, v2.4s }, [x0], #48
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3 { v0.4s, v1.4s, v2.4s }, [x28]
+// CHECK-NEXT: add x0, x0, #48
+ld3 { v0.4s, v1.4s, v2.4s }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3 { v0.4s, v1.4s, v2.4s }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+// LD4/ST4 multiple structures
+ld4 { v0.8b, v1.8b, v2.8b, v3.8b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4 { v0.8b, v1.8b, v2.8b, v3.8b }, [x28]
+ld4 { v0.16b, v1.16b, v2.16b, v3.16b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4 { v0.16b, v1.16b, v2.16b, v3.16b }, [x28]
+ld4 { v0.4s, v1.4s, v2.4s, v3.4s }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4 { v0.4s, v1.4s, v2.4s, v3.4s }, [x28]
+ld4 { v0.2d, v1.2d, v2.2d, v3.2d }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4 { v0.2d, v1.2d, v2.2d, v3.2d }, [x28]
+st4 { v0.4s, v1.4s, v2.4s, v3.4s }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.4s, v1.4s, v2.4s, v3.4s }, [x28]
+ld4 { v0.4s, v1.4s, v2.4s, v3.4s }, [x0], #64
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4 { v0.4s, v1.4s, v2.4s, v3.4s }, [x28]
+// CHECK-NEXT: add x0, x0, #64
+ld4 { v0.4s, v1.4s, v2.4s, v3.4s }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4 { v0.4s, v1.4s, v2.4s, v3.4s }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+// LD2R/LD3R/LD4R (replicate)
+ld2r { v0.8b, v1.8b }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2r { v0.8b, v1.8b }, [x28]
+ld3r { v0.4s, v1.4s, v2.4s }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3r { v0.4s, v1.4s, v2.4s }, [x28]
+ld4r { v0.2d, v1.2d, v2.2d, v3.2d }, [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4r { v0.2d, v1.2d, v2.2d, v3.2d }, [x28]
+ld2r { v0.8b, v1.8b }, [x0], #2
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2r { v0.8b, v1.8b }, [x28]
+// CHECK-NEXT: add x0, x0, #2
+ld3r { v0.4s, v1.4s, v2.4s }, [x0], #12
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3r { v0.4s, v1.4s, v2.4s }, [x28]
+// CHECK-NEXT: add x0, x0, #12
+ld4r { v0.2d, v1.2d, v2.2d, v3.2d }, [x0], #32
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4r { v0.2d, v1.2d, v2.2d, v3.2d }, [x28]
+// CHECK-NEXT: add x0, x0, #32
+// LD2/LD3/LD4 single structure (lane loads)
+ld2 { v0.b, v1.b }[0], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.b, v1.b }[0], [x28]
+ld3 { v0.s, v1.s, v2.s }[1], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3 { v0.s, v1.s, v2.s }[1], [x28]
+ld4 { v0.d, v1.d, v2.d, v3.d }[0], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4 { v0.d, v1.d, v2.d, v3.d }[0], [x28]
+st2 { v0.h, v1.h }[3], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.h, v1.h }[3], [x28]
+st3 { v0.s, v1.s, v2.s }[2], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st3 { v0.s, v1.s, v2.s }[2], [x28]
+st4 { v0.d, v1.d, v2.d, v3.d }[1], [x0]
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.d, v1.d, v2.d, v3.d }[1], [x28]
+
+ld2 { v0.b, v1.b }[0], [x0], #2
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.b, v1.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, #2
+ld3 { v0.s, v1.s, v2.s }[1], [x0], #12
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3 { v0.s, v1.s, v2.s }[1], [x28]
+// CHECK-NEXT: add x0, x0, #12
+ld4 { v0.d, v1.d, v2.d, v3.d }[0], [x0], #32
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4 { v0.d, v1.d, v2.d, v3.d }[0], [x28]
+// CHECK-NEXT: add x0, x0, #32
+ld2 { v0.s, v1.s }[1], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.s, v1.s }[1], [x28]
+// CHECK-NEXT: add x0, x0, x1
+
+// ST2/ST3/ST4 lane stores with immediate post-index
+st2 { v0.b, v1.b }[0], [x0], #2
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.b, v1.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, #2
+st2 { v0.h, v1.h }[1], [x0], #4
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.h, v1.h }[1], [x28]
+// CHECK-NEXT: add x0, x0, #4
+st2 { v0.s, v1.s }[2], [x0], #8
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.s, v1.s }[2], [x28]
+// CHECK-NEXT: add x0, x0, #8
+st2 { v0.d, v1.d }[1], [x0], #16
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.d, v1.d }[1], [x28]
+// CHECK-NEXT: add x0, x0, #16
+st3 { v0.b, v1.b, v2.b }[0], [x0], #3
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st3 { v0.b, v1.b, v2.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, #3
+st3 { v0.h, v1.h, v2.h }[1], [x0], #6
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st3 { v0.h, v1.h, v2.h }[1], [x28]
+// CHECK-NEXT: add x0, x0, #6
+st3 { v0.s, v1.s, v2.s }[2], [x0], #12
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st3 { v0.s, v1.s, v2.s }[2], [x28]
+// CHECK-NEXT: add x0, x0, #12
+st3 { v0.d, v1.d, v2.d }[0], [x0], #24
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st3 { v0.d, v1.d, v2.d }[0], [x28]
+// CHECK-NEXT: add x0, x0, #24
+st4 { v0.b, v1.b, v2.b, v3.b }[0], [x0], #4
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.b, v1.b, v2.b, v3.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, #4
+st4 { v0.h, v1.h, v2.h, v3.h }[1], [x0], #8
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.h, v1.h, v2.h, v3.h }[1], [x28]
+// CHECK-NEXT: add x0, x0, #8
+st4 { v0.s, v1.s, v2.s, v3.s }[2], [x0], #16
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.s, v1.s, v2.s, v3.s }[2], [x28]
+// CHECK-NEXT: add x0, x0, #16
+st4 { v0.d, v1.d, v2.d, v3.d }[0], [x0], #32
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.d, v1.d, v2.d, v3.d }[0], [x28]
+// CHECK-NEXT: add x0, x0, #32
+
+// ST1/ST2/ST3/ST4 lane stores with register post-index
+st1 { v0.b }[0], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st1 { v0.h }[1], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.h }[1], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st1 { v0.s }[2], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.s }[2], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st2 { v0.b, v1.b }[0], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.b, v1.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st2 { v0.s, v1.s }[1], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.s, v1.s }[1], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st2 { v0.d, v1.d }[0], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.d, v1.d }[0], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st3 { v0.b, v1.b, v2.b }[0], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st3 { v0.b, v1.b, v2.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st3 { v0.s, v1.s, v2.s }[1], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st3 { v0.s, v1.s, v2.s }[1], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st3 { v0.d, v1.d, v2.d }[0], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st3 { v0.d, v1.d, v2.d }[0], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st4 { v0.b, v1.b, v2.b, v3.b }[0], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.b, v1.b, v2.b, v3.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st4 { v0.s, v1.s, v2.s, v3.s }[1], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.s, v1.s, v2.s, v3.s }[1], [x28]
+// CHECK-NEXT: add x0, x0, x1
+st4 { v0.d, v1.d, v2.d, v3.d }[0], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.d, v1.d, v2.d, v3.d }[0], [x28]
+// CHECK-NEXT: add x0, x0, x1
+
+ld3 { v0.b, v1.b, v2.b }[0], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3 { v0.b, v1.b, v2.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, x1
+ld3 { v0.d, v1.d, v2.d }[0], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld3 { v0.d, v1.d, v2.d }[0], [x28]
+// CHECK-NEXT: add x0, x0, x1
+ld4 { v0.b, v1.b, v2.b, v3.b }[0], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4 { v0.b, v1.b, v2.b, v3.b }[0], [x28]
+// CHECK-NEXT: add x0, x0, x1
+ld4 { v0.s, v1.s, v2.s, v3.s }[1], [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld4 { v0.s, v1.s, v2.s, v3.s }[1], [x28]
+// CHECK-NEXT: add x0, x0, x1
+
+// ST1/ST2/ST3/ST4 multi-register stores with register post-index
+st1 { v0.16b }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.16b }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+st1 { v0.16b, v1.16b }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.16b, v1.16b }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+st1 { v0.16b, v1.16b, v2.16b }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.16b, v1.16b, v2.16b }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+st1 { v0.16b, v1.16b, v2.16b, v3.16b }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st1 { v0.16b, v1.16b, v2.16b, v3.16b }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+st2 { v0.16b, v1.16b }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.16b, v1.16b }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+st2 { v0.4s, v1.4s }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st2 { v0.4s, v1.4s }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+st3 { v0.16b, v1.16b, v2.16b }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st3 { v0.16b, v1.16b, v2.16b }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+st3 { v0.4s, v1.4s, v2.4s }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st3 { v0.4s, v1.4s, v2.4s }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+st4 { v0.16b, v1.16b, v2.16b, v3.16b }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.16b, v1.16b, v2.16b, v3.16b }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+st4 { v0.4s, v1.4s, v2.4s, v3.4s }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.4s, v1.4s, v2.4s, v3.4s }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+st4 { v0.2d, v1.2d, v2.2d, v3.2d }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: st4 { v0.2d, v1.2d, v2.2d, v3.2d }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+
+ld2 { v0.16b, v1.16b }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.16b, v1.16b }, [x28]
+// CHECK-NEXT: add x0, x0, x1
+ld2 { v0.4s, v1.4s }, [x0], x1
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: ld2 { v0.4s, v1.4s }, [x28]
+// CHECK-NEXT: add x0, x0, x1
diff --git a/llvm/test/MC/AArch64/LFI/stack.s b/llvm/test/MC/AArch64/LFI/stack.s
new file mode 100644
index 0000000000000..41e389401a428
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/stack.s
@@ -0,0 +1,37 @@
+// RUN: llvm-mc -triple aarch64_lfi %s | FileCheck %s
+
+ldr x0, [sp, #16]!
+// CHECK: ldr x0, [sp, #16]!
+
+ldr x0, [sp], #16
+// CHECK: ldr x0, [sp], #16
+
+str x0, [sp, #16]!
+// CHECK: str x0, [sp, #16]!
+
+str x0, [sp], #16
+// CHECK: str x0, [sp], #16
+
+mov sp, x0
+// CHECK:      add x26, x0, #0
+// CHECK-NEXT: add sp, x27, w26, uxtw
+
+add sp, sp, #8
+// CHECK:      add x26, sp, #8
+// CHECK-NEXT: add sp, x27, w26, uxtw
+
+sub sp, sp, #8
+// CHECK:      sub x26, sp, #8
+// CHECK-NEXT: add sp, x27, w26, uxtw
+
+add sp, sp, x0
+// CHECK:      add x26, sp, x0
+// CHECK-NEXT: add sp, x27, w26, uxtw
+
+sub sp, sp, x0
+// CHECK:      sub x26, sp, x0
+// CHECK-NEXT: add sp, x27, w26, uxtw
+
+sub sp, sp, #1, lsl #12
+// CHECK:      sub x26, sp, #1, lsl #12
+// CHECK-NEXT: add sp, x27, w26, uxtw
diff --git a/llvm/test/MC/AArch64/LFI/sys.s b/llvm/test/MC/AArch64/LFI/sys.s
new file mode 100644
index 0000000000000..9563cb62c9763
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/sys.s
@@ -0,0 +1,15 @@
+// RUN: llvm-mc -triple aarch64_lfi %s | FileCheck %s
+
+svc #0
+// CHECK:      mov x26, x30
+// CHECK-NEXT: ldur x30, [x27, #-8]
+// CHECK-NEXT: blr x30
+// CHECK-NEXT: add x30, x27, w26, uxtw
+
+dc zva, x0
+// CHECK:      add x28, x27, w0, uxtw
+// CHECK-NEXT: dc zva, x28
+
+dc zva, x5
+// CHECK:      add x28, x27, w5, uxtw
+// CHECK-NEXT: dc zva, x28
diff --git a/llvm/test/MC/AArch64/LFI/tls-reg.s b/llvm/test/MC/AArch64/LFI/tls-reg.s
new file mode 100644
index 0000000000000..5a536b5cf5443
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/tls-reg.s
@@ -0,0 +1,13 @@
+// RUN: llvm-mc -triple aarch64_lfi %s | FileCheck %s
+
+mrs x0, tpidr_el0
+// CHECK: ldr x0, [x25, #32]
+
+mrs x1, tpidr_el0
+// CHECK: ldr x1, [x25, #32]
+
+msr tpidr_el0, x0
+// CHECK: str x0, [x25, #32]
+
+msr tpidr_el0, x1
+// CHECK: str x1, [x25, #32]
diff --git a/llvm/test/MC/AArch64/LFI/unsupported/literal.s b/llvm/test/MC/AArch64/LFI/unsupported/literal.s
new file mode 100644
index 0000000000000..8e09b61d022c9
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/unsupported/literal.s
@@ -0,0 +1,26 @@
+// RUN: not llvm-mc -triple aarch64_lfi %s 2>&1 | FileCheck %s
+
+ldr x0, label
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+ldr w0, label
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+ldr s0, label
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+ldr d0, label
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+ldr q0, label
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+ldrsw x0, label
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+prfm pldl1keep, label
+// CHECK: error: PC-relative literal loads are not supported in LFI
+
+label:
+.word 0x12345678
+
diff --git a/llvm/test/MC/AArch64/LFI/unsupported/pac.s b/llvm/test/MC/AArch64/LFI/unsupported/pac.s
new file mode 100644
index 0000000000000..936b08396f2fd
--- /dev/null
+++ b/llvm/test/MC/AArch64/LFI/unsupported/pac.s
@@ -0,0 +1,13 @@
+// RUN: not llvm-mc -triple aarch64_lfi %s 2>&1 | FileCheck %s
+
+.arch_extension pauth
+
+eret
+// CHECK: error: exception returns (ERET/ERETAA/ERETAB) are not supported by LFI
+
+eretaa
+// CHECK: error: exception returns (ERET/ERETAA/ERETAB) are not supported by LFI
+
+eretab
+// CHECK: error: exception returns (ERET/ERETAA/ERETAB) are not supported by LFI
+
diff --git a/llvm/unittests/Target/AArch64/CMakeLists.txt b/llvm/unittests/Target/AArch64/CMakeLists.txt
index 3875163772575..5826383f1f1f9 100644
--- a/llvm/unittests/Target/AArch64/CMakeLists.txt
+++ b/llvm/unittests/Target/AArch64/CMakeLists.txt
@@ -29,6 +29,7 @@ add_llvm_target_unittest(AArch64Tests
   DecomposeStackOffsetTest.cpp
   InstSizes.cpp
   MatrixRegisterAliasing.cpp
+  MemOpAddrModeTest.cpp
   SMEAttributesTest.cpp
   AArch64RegisterInfoTest.cpp
   AArch64SVESchedPseudoTest.cpp
diff --git a/llvm/unittests/Target/AArch64/MemOpAddrModeTest.cpp b/llvm/unittests/Target/AArch64/MemOpAddrModeTest.cpp
new file mode 100644
index 0000000000000..755fcfcf35c2a
--- /dev/null
+++ b/llvm/unittests/Target/AArch64/MemOpAddrModeTest.cpp
@@ -0,0 +1,158 @@
+//===- MemOpAddrModeTest.cpp - Test memory operand addressing modes -------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "AArch64InstrInfo.h"
+#include "AArch64Subtarget.h"
+#include "AArch64TargetMachine.h"
+#include "llvm/MC/TargetRegistry.h"
+#include "llvm/Support/TargetSelect.h"
+#include "llvm/Target/TargetMachine.h"
+#include "llvm/Target/TargetOptions.h"
+#include "gtest/gtest.h"
+
+using namespace llvm;
+
+namespace {
+
+class MemOpAddrModeTest : public testing::Test {
+protected:
+  static void SetUpTestSuite() {
+    LLVMInitializeAArch64TargetInfo();
+    LLVMInitializeAArch64Target();
+    LLVMInitializeAArch64TargetMC();
+  }
+
+  void SetUp() override {
+    Triple TT("aarch64-unknown-linux-gnu");
+    std::string Error;
+    const Target *T = TargetRegistry::lookupTarget(TT, Error);
+    if (!T)
+      GTEST_SKIP() << Error;
+
+    TargetOptions Options;
+    TM.reset(T->createTargetMachine(TT, "generic", "+lse", Options,
+                                    std::nullopt, std::nullopt,
+                                    CodeGenOptLevel::Default));
+    MII = TM->getMCInstrInfo();
+  }
+
+  std::unique_ptr<TargetMachine> TM;
+  const MCInstrInfo *MII = nullptr;
+};
+
+TEST_F(MemOpAddrModeTest, IndexedLoads) {
+  // LDRXui should have Indexed mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LDRXui);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModeIndexed);
+  EXPECT_EQ(AArch64::getMemOpBaseRegIdx(Desc.TSFlags), 1);
+  EXPECT_EQ(AArch64::getMemOpOffsetIdx(Desc.TSFlags), 2);
+  EXPECT_FALSE(AArch64::isMemOpPrePostIdx(Desc.TSFlags));
+}
+
+TEST_F(MemOpAddrModeTest, UnscaledLoads) {
+  // LDURXi should have Unscaled mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LDURXi);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModeUnscaled);
+  EXPECT_EQ(AArch64::getMemOpBaseRegIdx(Desc.TSFlags), 1);
+  EXPECT_FALSE(AArch64::isMemOpPrePostIdx(Desc.TSFlags));
+}
+
+TEST_F(MemOpAddrModeTest, RegisterOffsetLoads) {
+  // LDRXroX should have RegOff mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LDRXroX);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModeRegOff);
+  EXPECT_EQ(AArch64::getMemOpBaseRegIdx(Desc.TSFlags), 1);
+}
+
+TEST_F(MemOpAddrModeTest, PreIndexLoads) {
+  // LDRXpre should have PreIdx mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LDRXpre);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModePreIdx);
+  EXPECT_EQ(AArch64::getMemOpBaseRegIdx(Desc.TSFlags), 2);
+  EXPECT_TRUE(AArch64::isMemOpPrePostIdx(Desc.TSFlags));
+}
+
+TEST_F(MemOpAddrModeTest, PostIndexLoads) {
+  // LDRXpost should have PostIdx mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LDRXpost);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModePostIdx);
+  EXPECT_EQ(AArch64::getMemOpBaseRegIdx(Desc.TSFlags), 2);
+  EXPECT_TRUE(AArch64::isMemOpPrePostIdx(Desc.TSFlags));
+}
+
+TEST_F(MemOpAddrModeTest, LiteralLoads) {
+  // LDRXl should have Literal mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LDRXl);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModeLiteral);
+  EXPECT_EQ(AArch64::getMemOpBaseRegIdx(Desc.TSFlags), -1);
+}
+
+TEST_F(MemOpAddrModeTest, PairLoads) {
+  // LDPXi should have Pair mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LDPXi);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModePair);
+  EXPECT_EQ(AArch64::getMemOpBaseRegIdx(Desc.TSFlags), 2);
+  EXPECT_FALSE(AArch64::isMemOpPrePostIdx(Desc.TSFlags));
+}
+
+TEST_F(MemOpAddrModeTest, PairPreIndexLoads) {
+  // LDPXpre should have PairPre mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LDPXpre);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModePairPre);
+  EXPECT_EQ(AArch64::getMemOpBaseRegIdx(Desc.TSFlags), 3);
+  EXPECT_TRUE(AArch64::isMemOpPrePostIdx(Desc.TSFlags));
+}
+
+TEST_F(MemOpAddrModeTest, SIMDLoadNoIndex) {
+  // LD1Twov8b should have NoIdx mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LD1Twov8b);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModeNoIdx);
+  EXPECT_EQ(AArch64::getMemOpBaseRegIdx(Desc.TSFlags), 1);
+  EXPECT_EQ(AArch64::getMemOpOffsetIdx(Desc.TSFlags), -1);
+}
+
+TEST_F(MemOpAddrModeTest, SIMDLoadPostIndexReg) {
+  // LD1Twov8b_POST should have PostIdxReg mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LD1Twov8b_POST);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModePostIdxReg);
+  EXPECT_TRUE(AArch64::isMemOpPrePostIdx(Desc.TSFlags));
+}
+
+TEST_F(MemOpAddrModeTest, AtomicLoads) {
+  // LDADDX should have NoIdx mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LDADDX);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModeNoIdx);
+}
+
+TEST_F(MemOpAddrModeTest, ExclusiveLoads) {
+  // LDXRX should have NoIdx mode
+  const MCInstrDesc &Desc = MII->get(AArch64::LDXRX);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModeNoIdx);
+}
+
+TEST_F(MemOpAddrModeTest, NonMemoryInstructions) {
+  // ADDXri should have None mode
+  const MCInstrDesc &Desc = MII->get(AArch64::ADDXri);
+  uint64_t Mode = Desc.TSFlags & AArch64::MemOpAddrModeMask;
+  EXPECT_EQ(Mode, AArch64::MemOpAddrModeNone);
+  EXPECT_EQ(AArch64::getMemOpBaseRegIdx(Desc.TSFlags), -1);
+}
+
+} // namespace

>From eb92ac4e2b704d5b9b71f4e5fd076e9e576e9e58 Mon Sep 17 00:00:00 2001
From: Zachary Yedidia <zyedidia at gmail.com>
Date: Tue, 3 Mar 2026 04:27:39 -0500
Subject: [PATCH 2/2] Apply clang-format fixes

---
 .../Target/AArch64/Utils/AArch64BaseInfo.h    | 32 +++++++++----------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h b/llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
index 2a5415365e86b..b2ccaf3c31f76 100644
--- a/llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
+++ b/llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
@@ -1061,28 +1061,28 @@ static constexpr unsigned SVEMaxBitsPerVector = 2048;
 
 // TSFlags layout for memory operation fields (bits 14-26).
 // See AArch64InstrInfo.h for the full TSFlags layout.
-#define TSFLAG_MEM_OP_ADDR_MODE(X)      ((X) << 14) // 5-bits
-#define TSFLAG_MEM_OP_BASE_IDX(X)       ((X) << 19) // 4-bits
-#define TSFLAG_MEM_OP_OFFSET_IDX(X)     ((X) << 23) // 4-bits
+#define TSFLAG_MEM_OP_ADDR_MODE(X) ((X) << 14)  // 5-bits
+#define TSFLAG_MEM_OP_BASE_IDX(X) ((X) << 19)   // 4-bits
+#define TSFLAG_MEM_OP_OFFSET_IDX(X) ((X) << 23) // 4-bits
 
 namespace AArch64 {
 
 /// Memory operation addressing mode classification for load/store instructions.
 /// Used to identify operand layout for memory operations.
 enum MemOpAddrModeType {
-  MemOpAddrModeMask       = TSFLAG_MEM_OP_ADDR_MODE(0x1f),
-  MemOpAddrModeNone       = TSFLAG_MEM_OP_ADDR_MODE(0x0),  // Not a memory op
-  MemOpAddrModeIndexed    = TSFLAG_MEM_OP_ADDR_MODE(0x1),  // [Xn, #imm]
-  MemOpAddrModeUnscaled   = TSFLAG_MEM_OP_ADDR_MODE(0x2),  // [Xn, #simm]
-  MemOpAddrModePreIdx     = TSFLAG_MEM_OP_ADDR_MODE(0x3),  // [Xn, #imm]!
-  MemOpAddrModePostIdx    = TSFLAG_MEM_OP_ADDR_MODE(0x4),  // [Xn], #imm
-  MemOpAddrModeRegOff     = TSFLAG_MEM_OP_ADDR_MODE(0x5),  // [Xn, Xm, ext]
-  MemOpAddrModeLiteral    = TSFLAG_MEM_OP_ADDR_MODE(0x6),  // PC-relative
-  MemOpAddrModeNoIdx      = TSFLAG_MEM_OP_ADDR_MODE(0x7),  // [Xn] (no offset)
-  MemOpAddrModePair       = TSFLAG_MEM_OP_ADDR_MODE(0x8),  // LDP/STP [Xn, #imm]
-  MemOpAddrModePairPre    = TSFLAG_MEM_OP_ADDR_MODE(0x9),  // LDP/STP [Xn, #imm]!
-  MemOpAddrModePairPost   = TSFLAG_MEM_OP_ADDR_MODE(0xa),  // LDP/STP [Xn], #imm
-  MemOpAddrModePostIdxReg = TSFLAG_MEM_OP_ADDR_MODE(0xb),  // [Xn], Xm (SIMD)
+  MemOpAddrModeMask = TSFLAG_MEM_OP_ADDR_MODE(0x1f),
+  MemOpAddrModeNone = TSFLAG_MEM_OP_ADDR_MODE(0x0),       // Not a memory op
+  MemOpAddrModeIndexed = TSFLAG_MEM_OP_ADDR_MODE(0x1),    // [Xn, #imm]
+  MemOpAddrModeUnscaled = TSFLAG_MEM_OP_ADDR_MODE(0x2),   // [Xn, #simm]
+  MemOpAddrModePreIdx = TSFLAG_MEM_OP_ADDR_MODE(0x3),     // [Xn, #imm]!
+  MemOpAddrModePostIdx = TSFLAG_MEM_OP_ADDR_MODE(0x4),    // [Xn], #imm
+  MemOpAddrModeRegOff = TSFLAG_MEM_OP_ADDR_MODE(0x5),     // [Xn, Xm, ext]
+  MemOpAddrModeLiteral = TSFLAG_MEM_OP_ADDR_MODE(0x6),    // PC-relative
+  MemOpAddrModeNoIdx = TSFLAG_MEM_OP_ADDR_MODE(0x7),      // [Xn] (no offset)
+  MemOpAddrModePair = TSFLAG_MEM_OP_ADDR_MODE(0x8),       // LDP/STP [Xn, #imm]
+  MemOpAddrModePairPre = TSFLAG_MEM_OP_ADDR_MODE(0x9),    // LDP/STP [Xn, #imm]!
+  MemOpAddrModePairPost = TSFLAG_MEM_OP_ADDR_MODE(0xa),   // LDP/STP [Xn], #imm
+  MemOpAddrModePostIdxReg = TSFLAG_MEM_OP_ADDR_MODE(0xb), // [Xn], Xm (SIMD)
 };
 
 /// Mask and shift for extracting the base register operand index.


