[Lldb-commits] [lldb] [lldb][AArch64] Add SME2's ZT0 register (PR #70205)

David Spickett via lldb-commits lldb-commits at lists.llvm.org
Thu Oct 26 01:34:10 PDT 2023


https://github.com/DavidSpickett updated https://github.com/llvm/llvm-project/pull/70205

>From 80e01960f247ab9ee06daa59d9c033fda5fc3978 Mon Sep 17 00:00:00 2001
From: David Spickett <david.spickett at linaro.org>
Date: Tue, 3 Oct 2023 13:24:39 +0100
Subject: [PATCH 1/3] [lldb][AArch64] Add SME2's ZT0 register

SME2 is documented as part of the main SME supplement:
https://developer.arm.com/documentation/ddi0616/latest/

The one change for debug is this new ZT0 register. This register
contains data to be used with new table lookup instructions.
It's size is always 512 bits (not scalable) and can be
interpreted in many different ways depending on the instructions
that use it.

The kernel has implemented this as a new register set containing
this single register. It always returns register data (with no header,
unlike ZA which does have a header).

https://docs.kernel.org/arch/arm64/sme.html

ZT0 is only active when ZA is active (when SVCR.ZA is 1). In the
inactive state the kernel returns 0s for its contents. Therefore
lldb doesn't need to create 0s like it does for ZA.

However, we will skip restoring the value of ZT0 if we know that
ZA is inactive. As writing to an inactive ZT0 sets SVCR.ZA to 1,
which is not desireable as it would activate ZA also. Whether
SVCR.ZA is set will be determined only by the ZA data we restore.

Due to this, I've added a new save/restore kind SME2. This is easier
than accounting for the variable length ZA in the SME data. We'll only
save an SME2 data block if ZA is active. If it's not we can get fresh
0s back from the kernel for ZT0 anyway so there's nothing for us to restore.

This new register will only show up if the system has SME2 therefore
the SME set presented to the user may change, and I've had to account
for that in in a few places.

I've referred to it internally as simply "ZT" as the kernel does in
NT_ARM_ZT, but the architecture refers to the specific register as "ZT0"
so that's what you'll see in lldb.

```
(lldb) register read -s 6
Scalable Matrix Extension Registers:
      svcr = 0x0000000000000000
       svg = 0x0000000000000004
        za = {0x00 <...> 0x00}
       zt0 = {0x00 <...> 0x00}
```
---
 .../NativeRegisterContextLinux_arm64.cpp      | 133 ++++++++++++++++--
 .../Linux/NativeRegisterContextLinux_arm64.h  |  12 ++
 .../Utility/RegisterInfoPOSIX_arm64.cpp       |  50 +++++--
 .../Process/Utility/RegisterInfoPOSIX_arm64.h |   5 +-
 .../Process/elf-core/RegisterUtilities.h      |   4 +
 5 files changed, 177 insertions(+), 27 deletions(-)

diff --git a/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp b/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
index b5210c368144206..d4c83dcba6363f5 100644
--- a/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
+++ b/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
@@ -45,6 +45,11 @@
 #define NT_ARM_ZA 0x40c /* ARM Scalable Matrix Extension, Array Storage */
 #endif
 
+#ifndef NT_ARM_ZT
+#define NT_ARM_ZT                                                              \
+  0x40d /* ARM Scalable Matrix Extension 2, lookup table register */
+#endif
+
 #ifndef NT_ARM_PAC_MASK
 #define NT_ARM_PAC_MASK 0x406 /* Pointer authentication code masks */
 #endif
@@ -104,6 +109,17 @@ NativeRegisterContextLinux::CreateHostNativeRegisterContextLinux(
             .Success())
       opt_regsets.Set(RegisterInfoPOSIX_arm64::eRegsetMaskZA);
 
+    // SME's ZT0 is a 512 bit register.
+    std::array<uint8_t, 64> zt_reg;
+    ioVec.iov_base = zt_reg.data();
+    ioVec.iov_len = zt_reg.size();
+    regset = NT_ARM_ZT;
+    if (NativeProcessLinux::PtraceWrapper(PTRACE_GETREGSET,
+                                          native_thread.GetID(), &regset,
+                                          &ioVec, zt_reg.size())
+            .Success())
+      opt_regsets.Set(RegisterInfoPOSIX_arm64::eRegsetMaskZT);
+
     NativeProcessLinux &process = native_thread.GetProcess();
 
     std::optional<uint64_t> auxv_at_hwcap =
@@ -148,6 +164,7 @@ NativeRegisterContextLinux_arm64::NativeRegisterContextLinux_arm64(
   ::memset(&m_pac_mask, 0, sizeof(m_pac_mask));
   ::memset(&m_tls_regs, 0, sizeof(m_tls_regs));
   ::memset(&m_sme_pseudo_regs, 0, sizeof(m_sme_pseudo_regs));
+  std::fill(m_zt_reg.begin(), m_zt_reg.end(), 0);
 
   m_mte_ctrl_reg = 0;
 
@@ -164,6 +181,7 @@ NativeRegisterContextLinux_arm64::NativeRegisterContextLinux_arm64(
   m_pac_mask_is_valid = false;
   m_mte_ctrl_is_valid = false;
   m_tls_is_valid = false;
+  m_zt_buffer_is_valid = false;
 
   // SME adds the tpidr2 register
   m_tls_size = GetRegisterInfo().IsSSVEEnabled() ? sizeof(m_tls_regs)
@@ -355,6 +373,15 @@ NativeRegisterContextLinux_arm64::ReadRegister(const RegisterInfo *reg_info,
       // storage. Therefore its effective byte offset is always 0 even if it
       // isn't 0 within the SME register set.
       src = (uint8_t *)GetZABuffer() + GetZAHeaderSize();
+    } else if (GetRegisterInfo().IsSMERegZT(reg)) {
+      // Unlike ZA, the kernel will return register data for ZT0 when ZA is not
+      // enabled. This data will be all 0s so we don't have to invent anything
+      // like we did for ZA.
+      error = ReadZT();
+      if (error.Fail())
+        return error;
+
+      src = (uint8_t *)GetZTBuffer();
     } else {
       error = ReadSMESVG();
       if (error.Fail())
@@ -552,22 +579,31 @@ Status NativeRegisterContextLinux_arm64::WriteRegister(
 
     return WriteTLS();
   } else if (IsSME(reg)) {
-    if (!GetRegisterInfo().IsSMERegZA(reg))
-      return Status("Writing to SVG or SVCR is not supported.");
+    if (GetRegisterInfo().IsSMERegZA(reg)) {
+      error = ReadZA();
+      if (error.Fail())
+        return error;
 
-    error = ReadZA();
-    if (error.Fail())
-      return error;
+      // ZA is part of the SME set but not stored with the other SME registers.
+      // So its byte offset is effectively always 0.
+      dst = (uint8_t *)GetZABuffer() + GetZAHeaderSize();
+      ::memcpy(dst, reg_value.GetBytes(), reg_info->byte_size);
 
-    // ZA is part of the SME set but not stored with the other SME registers.
-    // So its byte offset is effectively always 0.
-    dst = (uint8_t *)GetZABuffer() + GetZAHeaderSize();
-    ::memcpy(dst, reg_value.GetBytes(), reg_info->byte_size);
+      // While this is writing a header that contains a vector length, the only
+      // way to change that is via the vg register. So here we assume the length
+      // will always be the current length and no reconfigure is needed.
+      return WriteZA();
+    } else if (GetRegisterInfo().IsSMERegZT(reg)) {
+      error = ReadZT();
+      if (error.Fail())
+        return error;
 
-    // While this is writing a header that contains a vector length, the only
-    // way to change that is via the vg register. So here we assume the length
-    // will always be the current length and no reconfigure is needed.
-    return WriteZA();
+      dst = (uint8_t *)GetZTBuffer();
+      ::memcpy(dst, reg_value.GetBytes(), reg_info->byte_size);
+
+      return WriteZT();
+    } else
+      return Status("Writing to SVG or SVCR is not supported.");
   }
 
   return Status("Failed to write register value");
@@ -580,7 +616,8 @@ enum RegisterSetType : uint32_t {
   // Pointer authentication registers are read only, so not included here.
   MTE,
   TLS,
-  SME, // ZA only, SVCR and SVG are pseudo registers.
+  SME,  // ZA only , SVCR and SVG are pseudo registers.
+  SME2, // ZT only.
 };
 
 static uint8_t *AddRegisterSetType(uint8_t *dst,
@@ -625,6 +662,18 @@ NativeRegisterContextLinux_arm64::CacheAllRegisters(uint32_t &cached_size) {
     error = ReadZA();
     if (error.Fail())
       return error;
+
+    // We will only be restoring ZT data if ZA is active. As writing to an
+    // inactive ZT enables ZA, which may not be desireable.
+    if (GetRegisterInfo().IsZTEnabled() &&
+        m_za_header.size > sizeof(m_za_header)) {
+      cached_size += sizeof(RegisterSetType) + GetZTBufferSize();
+      // Unlike ZA where we have to fake data for an inactive ZA, the kernel
+      // handles an inactive ZA for us and always returns some data.
+      error = ReadZT();
+      if (error.Fail())
+        return error;
+    }
   }
 
   // If SVE is enabled we need not copy FPR separately.
@@ -732,6 +781,15 @@ Status NativeRegisterContextLinux_arm64::ReadAllRegisterValues(
                             m_za_header.size);
   }
 
+  // If ZT is present and we are going to be restoring an active ZA (which
+  // implies an active ZT0), then restore ZT0 after ZA has been set. This
+  // prevents us enabling ZA accidentally after the restore of ZA disabled it.
+  // If we leave ZA/ZT0 inactive and read ZT0, the kernel returns 0s. Therefore
+  // there's nothing for us to restore if ZA was originally inactive.
+  if (GetRegisterInfo().IsZTEnabled() && m_za_header.size > sizeof(m_za_header))
+    dst = AddSavedRegisters(dst, RegisterSetType::SME2, GetZTBuffer(),
+                            GetZTBufferSize());
+
   if (GetRegisterInfo().IsMTEEnabled()) {
     dst = AddSavedRegisters(dst, RegisterSetType::MTE, GetMTEControl(),
                             GetMTEControlSize());
@@ -875,6 +933,14 @@ Status NativeRegisterContextLinux_arm64::WriteAllRegisterValues(
       error = ReadZA();
       src += GetZABufferSize();
       break;
+    case RegisterSetType::SME2:
+      // Doing this would activate an inactive ZA, however we will only get here
+      // if the state we are restoring had an active ZA. Restoring ZT0 will
+      // always come after restoring ZA.
+      error = RestoreRegisters(
+          GetZTBuffer(), &src, GetZTBufferSize(), m_zt_buffer_is_valid,
+          std::bind(&NativeRegisterContextLinux_arm64::WriteZT, this));
+      break;
     }
 
     if (error.Fail())
@@ -1064,6 +1130,7 @@ void NativeRegisterContextLinux_arm64::InvalidateAllRegisters() {
   m_pac_mask_is_valid = false;
   m_mte_ctrl_is_valid = false;
   m_tls_is_valid = false;
+  m_zt_buffer_is_valid = false;
 
   // Update SVE and ZA registers in case there is change in configuration.
   ConfigureRegisterContext();
@@ -1301,10 +1368,48 @@ Status NativeRegisterContextLinux_arm64::WriteZA() {
 
   m_za_buffer_is_valid = false;
   m_za_header_is_valid = false;
+  // Writing to ZA may enable ZA, which means ZT0 may change too.
+  m_zt_buffer_is_valid = false;
 
   return WriteRegisterSet(&ioVec, GetZABufferSize(), NT_ARM_ZA);
 }
 
+Status NativeRegisterContextLinux_arm64::ReadZT() {
+  Status error;
+
+  if (m_zt_buffer_is_valid)
+    return error;
+
+  struct iovec ioVec;
+  ioVec.iov_base = GetZTBuffer();
+  ioVec.iov_len = GetZTBufferSize();
+
+  error = ReadRegisterSet(&ioVec, GetZTBufferSize(), NT_ARM_ZT);
+  m_zt_buffer_is_valid = error.Success();
+
+  return error;
+}
+
+Status NativeRegisterContextLinux_arm64::WriteZT() {
+  Status error;
+
+  error = ReadZT();
+  if (error.Fail())
+    return error;
+
+  struct iovec ioVec;
+  ioVec.iov_base = GetZTBuffer();
+  ioVec.iov_len = GetZTBufferSize();
+
+  m_zt_buffer_is_valid = false;
+  // Writing to an inactive ZT0 will enable ZA as well, which invalidates our
+  // current copy of it.
+  m_za_buffer_is_valid = false;
+  m_za_header_is_valid = false;
+
+  return WriteRegisterSet(&ioVec, GetZTBufferSize(), NT_ARM_ZT);
+}
+
 void NativeRegisterContextLinux_arm64::ConfigureRegisterContext() {
   // ConfigureRegisterContext gets called from InvalidateAllRegisters
   // on every stop and configures SVE vector length and whether we are in
diff --git a/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.h b/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.h
index 9fc2e7c1d2b9d75..6df7c3beefb8243 100644
--- a/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.h
+++ b/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.h
@@ -83,6 +83,7 @@ class NativeRegisterContextLinux_arm64
   bool m_fpu_is_valid;
   bool m_sve_buffer_is_valid;
   bool m_mte_ctrl_is_valid;
+  bool m_zt_buffer_is_valid;
 
   bool m_sve_header_is_valid;
   bool m_za_buffer_is_valid;
@@ -129,6 +130,9 @@ class NativeRegisterContextLinux_arm64
 
   struct tls_regs m_tls_regs;
 
+  // SME2's ZT is a 512 bit register.
+  std::array<uint8_t, 64> m_zt_reg;
+
   bool IsGPR(unsigned reg) const;
 
   bool IsFPR(unsigned reg) const;
@@ -163,6 +167,10 @@ class NativeRegisterContextLinux_arm64
   // Instead use WriteZA and ensure you have the correct ZA buffer size set
   // beforehand if you wish to disable it.
 
+  Status ReadZT();
+
+  Status WriteZT();
+
   // SVCR is a pseudo register and we do not allow writes to it.
   Status ReadSMEControl();
 
@@ -190,6 +198,8 @@ class NativeRegisterContextLinux_arm64
 
   void *GetSMEPseudoBuffer() { return &m_sme_pseudo_regs; }
 
+  void *GetZTBuffer() { return m_zt_reg.data(); }
+
   void *GetSVEBuffer() { return m_sve_ptrace_payload.data(); }
 
   size_t GetSVEHeaderSize() { return sizeof(m_sve_header); }
@@ -210,6 +220,8 @@ class NativeRegisterContextLinux_arm64
 
   size_t GetSMEPseudoBufferSize() { return sizeof(m_sme_pseudo_regs); }
 
+  size_t GetZTBufferSize() { return m_zt_reg.size(); }
+
   llvm::Error ReadHardwareDebugInfo() override;
 
   llvm::Error WriteHardwareDebugRegs(DREGType hwbType) override;
diff --git a/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp b/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
index 20ce4df3027acf1..425a4b67296f627 100644
--- a/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
+++ b/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
@@ -90,6 +90,10 @@ static lldb_private::RegisterInfo g_register_infos_sme[] = {
     {"za", nullptr, 16, 0, lldb::eEncodingVector, lldb::eFormatVectorOfUInt8,
      KIND_ALL_INVALID, nullptr, nullptr, nullptr}};
 
+static lldb_private::RegisterInfo g_register_infos_sme2[] = {
+    {"zt0", nullptr, 64, 0, lldb::eEncodingVector, lldb::eFormatVectorOfUInt8,
+     KIND_ALL_INVALID, nullptr, nullptr, nullptr}};
+
 // Number of register sets provided by this context.
 enum {
   k_num_gpr_registers = gpr_w28 - gpr_x0 + 1,
@@ -98,6 +102,8 @@ enum {
   k_num_mte_register = 1,
   // Number of TLS registers is dynamic so it is not listed here.
   k_num_pauth_register = 2,
+  // SME2's ZT0 will also be added to this set if present. So this number is
+  // only for SME1 registers.
   k_num_sme_register = 3,
   k_num_register_sets_default = 2,
   k_num_register_sets = 3
@@ -253,7 +259,7 @@ RegisterInfoPOSIX_arm64::RegisterInfoPOSIX_arm64(
       AddRegSetTLS(m_opt_regsets.AllSet(eRegsetMaskSSVE));
 
       if (m_opt_regsets.AnySet(eRegsetMaskSSVE))
-        AddRegSetSME();
+        AddRegSetSME(m_opt_regsets.AnySet(eRegsetMaskZT));
 
       m_register_info_count = m_dynamic_reg_infos.size();
       m_register_info_p = m_dynamic_reg_infos.data();
@@ -358,21 +364,35 @@ void RegisterInfoPOSIX_arm64::AddRegSetTLS(bool has_tpidr2) {
   m_dynamic_reg_sets.back().registers = m_tls_regnum_collection.data();
 }
 
-void RegisterInfoPOSIX_arm64::AddRegSetSME() {
-  uint32_t sme_regnum = m_dynamic_reg_infos.size();
-  for (uint32_t i = 0; i < k_num_sme_register; i++) {
-    m_sme_regnum_collection.push_back(sme_regnum + i);
+void RegisterInfoPOSIX_arm64::AddRegSetSME(bool has_zt) {
+  const uint32_t first_sme_regnum = m_dynamic_reg_infos.size();
+  uint32_t sme_regnum = first_sme_regnum;
+
+  for (uint32_t i = 0; i < k_num_sme_register; ++i, ++sme_regnum) {
+    m_sme_regnum_collection.push_back(sme_regnum);
     m_dynamic_reg_infos.push_back(g_register_infos_sme[i]);
-    m_dynamic_reg_infos[sme_regnum + i].byte_offset =
-        m_dynamic_reg_infos[sme_regnum + i - 1].byte_offset +
-        m_dynamic_reg_infos[sme_regnum + i - 1].byte_size;
-    m_dynamic_reg_infos[sme_regnum + i].kinds[lldb::eRegisterKindLLDB] =
-        sme_regnum + i;
+    m_dynamic_reg_infos[sme_regnum].byte_offset =
+        m_dynamic_reg_infos[sme_regnum - 1].byte_offset +
+        m_dynamic_reg_infos[sme_regnum - 1].byte_size;
+    m_dynamic_reg_infos[sme_regnum].kinds[lldb::eRegisterKindLLDB] = sme_regnum;
+  }
+
+  lldb_private::RegisterSet sme_regset = g_reg_set_sme_arm64;
+
+  if (has_zt) {
+    m_sme_regnum_collection.push_back(sme_regnum);
+    m_dynamic_reg_infos.push_back(g_register_infos_sme2[0]);
+    m_dynamic_reg_infos[sme_regnum].byte_offset =
+        m_dynamic_reg_infos[sme_regnum - 1].byte_offset +
+        m_dynamic_reg_infos[sme_regnum - 1].byte_size;
+    m_dynamic_reg_infos[sme_regnum].kinds[lldb::eRegisterKindLLDB] = sme_regnum;
+
+    sme_regset.num_registers += 1;
   }
 
   m_per_regset_regnum_range[m_register_set_count] =
-      std::make_pair(sme_regnum, m_dynamic_reg_infos.size());
-  m_dynamic_reg_sets.push_back(g_reg_set_sme_arm64);
+      std::make_pair(first_sme_regnum, m_dynamic_reg_infos.size());
+  m_dynamic_reg_sets.push_back(sme_regset);
   m_dynamic_reg_sets.back().registers = m_sme_regnum_collection.data();
 
   // When vg is written during streaming mode, svg will also change, as vg and
@@ -488,6 +508,12 @@ bool RegisterInfoPOSIX_arm64::IsSMERegZA(unsigned reg) const {
   return reg == m_sme_regnum_collection[2];
 }
 
+bool RegisterInfoPOSIX_arm64::IsSMERegZT(unsigned reg) const {
+  // ZT0 is part of the SME register set only if SME2 is present.
+  return m_sme_regnum_collection.size() == 4 &&
+         reg == m_sme_regnum_collection.back();
+}
+
 bool RegisterInfoPOSIX_arm64::IsPAuthReg(unsigned reg) const {
   return llvm::is_contained(pauth_regnum_collection, reg);
 }
diff --git a/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h b/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h
index debdf4c76abc25b..1f205475d9b71cc 100644
--- a/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h
+++ b/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.h
@@ -31,6 +31,7 @@ class RegisterInfoPOSIX_arm64
     eRegsetMaskMTE = 8,
     eRegsetMaskTLS = 16,
     eRegsetMaskZA = 32,
+    eRegsetMaskZT = 64,
     eRegsetMaskDynamic = ~1,
   };
 
@@ -107,7 +108,7 @@ class RegisterInfoPOSIX_arm64
 
   void AddRegSetTLS(bool has_tpidr2);
 
-  void AddRegSetSME();
+  void AddRegSetSME(bool has_zt);
 
   uint32_t ConfigureVectorLengthSVE(uint32_t sve_vq);
 
@@ -123,6 +124,7 @@ class RegisterInfoPOSIX_arm64
   bool IsSVEEnabled() const { return m_opt_regsets.AnySet(eRegsetMaskSVE); }
   bool IsSSVEEnabled() const { return m_opt_regsets.AnySet(eRegsetMaskSSVE); }
   bool IsZAEnabled() const { return m_opt_regsets.AnySet(eRegsetMaskZA); }
+  bool IsZTEnabled() const { return m_opt_regsets.AnySet(eRegsetMaskZT); }
   bool IsPAuthEnabled() const { return m_opt_regsets.AnySet(eRegsetMaskPAuth); }
   bool IsMTEEnabled() const { return m_opt_regsets.AnySet(eRegsetMaskMTE); }
   bool IsTLSEnabled() const { return m_opt_regsets.AnySet(eRegsetMaskTLS); }
@@ -136,6 +138,7 @@ class RegisterInfoPOSIX_arm64
   bool IsTLSReg(unsigned reg) const;
   bool IsSMEReg(unsigned reg) const;
   bool IsSMERegZA(unsigned reg) const;
+  bool IsSMERegZT(unsigned reg) const;
 
   uint32_t GetRegNumSVEZ0() const;
   uint32_t GetRegNumSVEFFR() const;
diff --git a/lldb/source/Plugins/Process/elf-core/RegisterUtilities.h b/lldb/source/Plugins/Process/elf-core/RegisterUtilities.h
index 4b01a56979bb1c4..12aa5f72371c510 100644
--- a/lldb/source/Plugins/Process/elf-core/RegisterUtilities.h
+++ b/lldb/source/Plugins/Process/elf-core/RegisterUtilities.h
@@ -127,6 +127,10 @@ constexpr RegsetDesc AARCH64_ZA_Desc[] = {
     {llvm::Triple::Linux, llvm::Triple::aarch64, llvm::ELF::NT_ARM_ZA},
 };
 
+constexpr RegsetDesc AARCH64_ZT_Desc[] = {
+    {llvm::Triple::Linux, llvm::Triple::aarch64, llvm::ELF::NT_ARM_ZT},
+};
+
 constexpr RegsetDesc AARCH64_PAC_Desc[] = {
     {llvm::Triple::Linux, llvm::Triple::aarch64, llvm::ELF::NT_ARM_PAC_MASK},
 };

>From 9624d376abc5fd0cb89bc157ffce6c114263101f Mon Sep 17 00:00:00 2001
From: David Spickett <david.spickett at linaro.org>
Date: Wed, 4 Oct 2023 13:52:06 +0100
Subject: [PATCH 2/3] (intended to be squashed later)

Add testing for ZT0.
* ZA save restore test now checks ZT0 as well.
* Added a new test for writing to ZT0 to enable ZA and ZT0.
* Moved tests into a top level SME Z register folder
  now that we are testing ZA and ZT0.
---
 .../Python/lldbsuite/test/lldbtest.py         |  4 +
 .../TestArm64DynamicRegsets.py                | 88 ++++++++++++++++---
 .../save_restore}/Makefile                    |  2 +-
 .../TestSMEZRegistersSaveRestore.py}          | 24 ++++-
 .../save_restore}/main.c                      | 31 +++++--
 .../za_dynamic_resize/Makefile                |  0
 .../TestZAThreadedDynamic.py                  |  0
 .../za_dynamic_resize/main.c                  |  0
 8 files changed, 128 insertions(+), 21 deletions(-)
 rename lldb/test/API/commands/register/register/{aarch64_za_register/za_save_restore => aarch64_sme_z_registers/save_restore}/Makefile (50%)
 rename lldb/test/API/commands/register/register/{aarch64_za_register/za_save_restore/TestZARegisterSaveRestore.py => aarch64_sme_z_registers/save_restore/TestSMEZRegistersSaveRestore.py} (91%)
 rename lldb/test/API/commands/register/register/{aarch64_za_register/za_save_restore => aarch64_sme_z_registers/save_restore}/main.c (91%)
 rename lldb/test/API/commands/register/register/{aarch64_za_register => aarch64_sme_z_registers}/za_dynamic_resize/Makefile (100%)
 rename lldb/test/API/commands/register/register/{aarch64_za_register => aarch64_sme_z_registers}/za_dynamic_resize/TestZAThreadedDynamic.py (100%)
 rename lldb/test/API/commands/register/register/{aarch64_za_register => aarch64_sme_z_registers}/za_dynamic_resize/main.c (100%)

diff --git a/lldb/packages/Python/lldbsuite/test/lldbtest.py b/lldb/packages/Python/lldbsuite/test/lldbtest.py
index fefe91401cb8e52..15e8ba21266c896 100644
--- a/lldb/packages/Python/lldbsuite/test/lldbtest.py
+++ b/lldb/packages/Python/lldbsuite/test/lldbtest.py
@@ -1271,6 +1271,10 @@ def isAArch64SVE(self):
     def isAArch64SME(self):
         return self.isAArch64() and "sme" in self.getCPUInfo()
 
+    def isAArch64SME2(self):
+        # If you have sme2, you also have sme.
+        return self.isAArch64() and "sme2" in self.getCPUInfo()
+
     def isAArch64SMEFA64(self):
         # smefa64 allows the use of the full A64 instruction set in streaming
         # mode. This is required by certain test programs to setup register
diff --git a/lldb/test/API/commands/register/register/aarch64_dynamic_regset/TestArm64DynamicRegsets.py b/lldb/test/API/commands/register/register/aarch64_dynamic_regset/TestArm64DynamicRegsets.py
index 0ad69c268a9fd29..eb121ecbfdbaf85 100644
--- a/lldb/test/API/commands/register/register/aarch64_dynamic_regset/TestArm64DynamicRegsets.py
+++ b/lldb/test/API/commands/register/register/aarch64_dynamic_regset/TestArm64DynamicRegsets.py
@@ -136,6 +136,14 @@ def make_za_value(self, vl, generator):
             rows.append(" ".join([byte] * vl))
         return "{" + " ".join(rows) + "}"
 
+    def make_zt0_value(self, generator):
+        num_bytes = 512 // 8
+        elements = []
+        for i in range(num_bytes):
+            elements.append("0x{:02x}".format(generator(i)))
+
+        return "{" + " ".join(elements) + "}"
+
     @no_debug_info_test
     @skipIf(archs=no_match(["aarch64"]))
     @skipIf(oslist=no_match(["linux"]))
@@ -179,16 +187,11 @@ def test_aarch64_dynamic_regset_config_sme(self):
 
         # SVCR is read only so we do not test writing to it.
 
-    @no_debug_info_test
-    @skipIf(archs=no_match(["aarch64"]))
-    @skipIf(oslist=no_match(["linux"]))
-    def test_aarch64_dynamic_regset_config_sme_za_disabled(self):
-        """Test that ZA shows as 0s when disabled and can be enabled by writing
-        to it."""
-        if not self.isAArch64SME():
-            self.skipTest("SME must be present.")
+    def write_to_enable_za_test(self, has_zt0, write_za_first):
+        # Run a test where we start with ZA disabled, and write to either ZA
+        # or ZT0 which causes them to become enabled.
 
-        # No argument, so ZA will be disabled when we break.
+        # No argument, so ZA and ZT0 will be disabled when we break.
         register_sets = self.setup_register_config_test()
 
         # vg is the non-streaming vg as we are in non-streaming mode, so we need
@@ -205,14 +208,71 @@ def test_aarch64_dynamic_regset_config_sme_za_disabled(self):
 
         svl = svg * 8
         # A disabled ZA is shown as all 0s.
-        self.expect("register read za", substrs=[self.make_za_value(svl, lambda r: 0)])
+        disabled_za = self.make_za_value(svl, lambda r: 0)
+        self.expect("register read za", substrs=[disabled_za])
+
+        disabled_zt0 = self.make_zt0_value(lambda n: 0)
+        if has_zt0:
+            # A disabled zt0 is all 0s.
+            self.expect("register read zt0", substrs=[disabled_zt0])
+
+        # Writing to ZA or ZTO enables both and we should be able to read the
+        # value back.
         za_value = self.make_za_value(svl, lambda r: r + 1)
-        # Writing to it enables ZA, so the value should be there when we read
-        # it back.
-        self.runCmd("register write za '{}'".format(za_value))
-        self.expect("register read za", substrs=[za_value])
+        zt0_value = self.make_zt0_value(lambda n: n + 1)
+
+        if write_za_first:
+            # This enables ZA and ZT0.
+            self.runCmd("register write za '{}'".format(za_value))
+            self.expect("register read za", substrs=[za_value])
+
+            if has_zt0:
+                # ZT0 is still 0s at this point, though it is active.
+                self.expect("register read zt0", substrs=[disabled_zt0])
+
+                # Now write ZT0 to we can check it reads back correctly.
+                self.runCmd("register write zt0 '{}'".format(zt0_value))
+                self.expect("register read zt0", substrs=[zt0_value])
+        else:
+            if not has_zt0:
+                self.fail("Cannot write to zt0 when sme2 is not present.")
+
+            # Instead use the write of ZT0 to activate ZA.
+            self.runCmd("register write zt0 '{}'".format(zt0_value))
+            self.expect("register read zt0", substrs=[zt0_value])
+
+            # ZA will be active but 0s at this point, but it is active.
+            self.expect("register read zt0", substrs=[disabled_za])
+
+            # Write and read back ZA.
+            self.runCmd("register write za '{}'".format(za_value))
+            self.expect("register read za", substrs=[za_value])
 
         # Now SVCR.ZA should be set, which is bit 1.
         self.expect("register read svcr", substrs=["0x0000000000000002"])
 
         # SVCR is read only so we do not test writing to it.
+
+    @no_debug_info_test
+    @skipIf(archs=no_match(["aarch64"]))
+    @skipIf(oslist=no_match(["linux"]))
+    def test_aarch64_dynamic_regset_config_sme_write_za_to_enable(self):
+        """Test that ZA and ZT0 (if present) shows as 0s when disabled and
+        can be enabled by writing to ZA."""
+        if not self.isAArch64SME():
+            self.skipTest("SME must be present.")
+
+        self.write_to_enable_za_test(self.isAArch64SME2(), True)
+
+    @no_debug_info_test
+    @skipIf(archs=no_match(["aarch64"]))
+    @skipIf(oslist=no_match(["linux"]))
+    def test_aarch64_dynamic_regset_config_sme_write_zt0_to_enable(self):
+        """Test that ZA and ZT0 (if present) shows as 0s when disabled and
+        can be enabled by writing to ZT0."""
+        if not self.isAArch64SME():
+            self.skipTest("SME must be present.")
+        if not self.isAArch64SME2():
+            self.skipTest("SME2 must be present.")
+
+        self.write_to_enable_za_test(True, True)
diff --git a/lldb/test/API/commands/register/register/aarch64_za_register/za_save_restore/Makefile b/lldb/test/API/commands/register/register/aarch64_sme_z_registers/save_restore/Makefile
similarity index 50%
rename from lldb/test/API/commands/register/register/aarch64_za_register/za_save_restore/Makefile
rename to lldb/test/API/commands/register/register/aarch64_sme_z_registers/save_restore/Makefile
index f2ca08f3531aa16..85304c044dab4d0 100644
--- a/lldb/test/API/commands/register/register/aarch64_za_register/za_save_restore/Makefile
+++ b/lldb/test/API/commands/register/register/aarch64_sme_z_registers/save_restore/Makefile
@@ -1,5 +1,5 @@
 C_SOURCES := main.c
 
-CFLAGS_EXTRAS := -march=armv8-a+sve+sme
+CFLAGS_EXTRAS := -march=armv8-a+sve+sme+sme2
 
 include Makefile.rules
diff --git a/lldb/test/API/commands/register/register/aarch64_za_register/za_save_restore/TestZARegisterSaveRestore.py b/lldb/test/API/commands/register/register/aarch64_sme_z_registers/save_restore/TestSMEZRegistersSaveRestore.py
similarity index 91%
rename from lldb/test/API/commands/register/register/aarch64_za_register/za_save_restore/TestZARegisterSaveRestore.py
rename to lldb/test/API/commands/register/register/aarch64_sme_z_registers/save_restore/TestSMEZRegistersSaveRestore.py
index a647c91f71119ec..a0949d80c56640a 100644
--- a/lldb/test/API/commands/register/register/aarch64_za_register/za_save_restore/TestZARegisterSaveRestore.py
+++ b/lldb/test/API/commands/register/register/aarch64_sme_z_registers/save_restore/TestSMEZRegistersSaveRestore.py
@@ -1,5 +1,6 @@
 """
-Test the AArch64 SME ZA register is saved and restored around expressions.
+Test the AArch64 SME ZA and ZT0 registers are saved and restored around
+expressions.
 
 This attempts to cover expressions that change the following:
 * ZA enabled or not.
@@ -105,6 +106,21 @@ def check_za_disabled(self, vl):
         # When ZA is disabled, lldb will show ZA as all 0s.
         self.expect("register read za", substrs=[self.make_za_value(vl, lambda row: 0)])
 
+    def make_zt0_value(self, generator):
+        num_bytes = 512 // 8
+        elements = []
+        for i in range(num_bytes):
+            elements.append("0x{:02x}".format(generator(i)))
+
+        return "{" + " ".join(elements) + "}"
+
+    def check_zt0(self):
+        self.expect("register read zt0", substrs=[self.make_zt0_value(lambda n: n + 1)])
+
+    def check_zt0_disabled(self):
+        # Like ZA, zt0 reads as 0 when SVCR.ZA is not set.
+        self.expect("register read zt0", substrs=[self.make_zt0_value(lambda n: 0)])
+
     def za_expr_test_impl(self, sve_mode, za_state, swap_start_vl):
         if not self.isAArch64SMEFA64():
             self.skipTest("SME and the smefa64 extension must be present.")
@@ -168,11 +184,17 @@ def za_expr_test_impl(self, sve_mode, za_state, swap_start_vl):
         if za_state == ZA.Enabled:
             svcr_value += 2
 
+        has_zt0 = self.isAArch64SME2()
+
         def check_regs():
             if za_state == ZA.Enabled:
                 self.check_za(start_vl)
+                if has_zt0:
+                    self.check_zt0()
             else:
                 self.check_za_disabled(start_vl)
+                if has_zt0:
+                    self.check_zt0_disabled()
 
             # svg and vg are in units of 8 bytes.
             self.assertEqual(start_vl, self.read_svg() * 8)
diff --git a/lldb/test/API/commands/register/register/aarch64_za_register/za_save_restore/main.c b/lldb/test/API/commands/register/register/aarch64_sme_z_registers/save_restore/main.c
similarity index 91%
rename from lldb/test/API/commands/register/register/aarch64_za_register/za_save_restore/main.c
rename to lldb/test/API/commands/register/register/aarch64_sme_z_registers/save_restore/main.c
index a8434787a5a1235..7dd424798d693d6 100644
--- a/lldb/test/API/commands/register/register/aarch64_za_register/za_save_restore/main.c
+++ b/lldb/test/API/commands/register/register/aarch64_sme_z_registers/save_restore/main.c
@@ -1,7 +1,9 @@
+#include <asm/hwcap.h>
 #include <stdbool.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <string.h>
+#include <sys/auxv.h>
 #include <sys/prctl.h>
 
 // Important details for this program:
@@ -30,8 +32,13 @@
 #define SMSTOP_SM SM_INST(2)
 #define SMSTOP_ZA SM_INST(4)
 
+#ifndef HWCAP2_SME2
+#define HWCAP2_SME2 (1UL << 37)
+#endif
+
 int start_vl = 0;
 int other_vl = 0;
+bool has_zt0 = false;
 
 void write_sve_regs() {
   // We assume the smefa64 feature is present, which allows ffr access
@@ -143,7 +150,7 @@ void write_sve_regs_expr() {
   asm volatile("cpy  z31.b, p15/z, #33\n\t");
 }
 
-void set_za_register(int svl, int value_offset) {
+void set_sme_registers(int svl, uint8_t value_offset) {
 #define MAX_VL_BYTES 256
   uint8_t data[MAX_VL_BYTES];
 
@@ -158,6 +165,17 @@ void set_za_register(int svl, int value_offset) {
                  "r"(&data)
                  : "w12");
   }
+#undef MAX_VL_BYTES
+
+  if (has_zt0) {
+#define ZTO_LEN (512 / 8)
+    uint8_t data[ZTO_LEN];
+    for (unsigned i = 0; i < ZTO_LEN; ++i)
+      data[i] = i + value_offset;
+
+    asm volatile("ldr zt0, [%0]" ::"r"(&data));
+#undef ZT0_LEN
+  }
 }
 
 void expr_disable_za() {
@@ -167,21 +185,21 @@ void expr_disable_za() {
 
 void expr_enable_za() {
   SMSTART_ZA;
-  set_za_register(start_vl, 2);
+  set_sme_registers(start_vl, 2);
   write_sve_regs_expr();
 }
 
 void expr_start_vl() {
   prctl(PR_SME_SET_VL, start_vl);
   SMSTART_ZA;
-  set_za_register(start_vl, 4);
+  set_sme_registers(start_vl, 4);
   write_sve_regs_expr();
 }
 
 void expr_other_vl() {
   prctl(PR_SME_SET_VL, other_vl);
   SMSTART_ZA;
-  set_za_register(other_vl, 5);
+  set_sme_registers(other_vl, 5);
   write_sve_regs_expr();
 }
 
@@ -209,6 +227,9 @@ int main(int argc, char *argv[]) {
   start_vl = atoi(argv[3]);
   other_vl = atoi(argv[4]);
 
+  if ((getauxval(AT_HWCAP2) & HWCAP2_SME2))
+    has_zt0 = true;
+
   prctl(PR_SME_SET_VL, start_vl);
 
   if (ssve)
@@ -216,7 +237,7 @@ int main(int argc, char *argv[]) {
 
   if (za) {
     SMSTART_ZA;
-    set_za_register(start_vl, 1);
+    set_sme_registers(start_vl, 1);
   }
 
   write_sve_regs();
diff --git a/lldb/test/API/commands/register/register/aarch64_za_register/za_dynamic_resize/Makefile b/lldb/test/API/commands/register/register/aarch64_sme_z_registers/za_dynamic_resize/Makefile
similarity index 100%
rename from lldb/test/API/commands/register/register/aarch64_za_register/za_dynamic_resize/Makefile
rename to lldb/test/API/commands/register/register/aarch64_sme_z_registers/za_dynamic_resize/Makefile
diff --git a/lldb/test/API/commands/register/register/aarch64_za_register/za_dynamic_resize/TestZAThreadedDynamic.py b/lldb/test/API/commands/register/register/aarch64_sme_z_registers/za_dynamic_resize/TestZAThreadedDynamic.py
similarity index 100%
rename from lldb/test/API/commands/register/register/aarch64_za_register/za_dynamic_resize/TestZAThreadedDynamic.py
rename to lldb/test/API/commands/register/register/aarch64_sme_z_registers/za_dynamic_resize/TestZAThreadedDynamic.py
diff --git a/lldb/test/API/commands/register/register/aarch64_za_register/za_dynamic_resize/main.c b/lldb/test/API/commands/register/register/aarch64_sme_z_registers/za_dynamic_resize/main.c
similarity index 100%
rename from lldb/test/API/commands/register/register/aarch64_za_register/za_dynamic_resize/main.c
rename to lldb/test/API/commands/register/register/aarch64_sme_z_registers/za_dynamic_resize/main.c

>From 3ed10ca4d37abce5114fe22110b587a0c78d2d0a Mon Sep 17 00:00:00 2001
From: David Spickett <david.spickett at linaro.org>
Date: Thu, 26 Oct 2023 09:22:34 +0100
Subject: [PATCH 3/3] Add comments to explain confusing enabled/present checks.

---
 .../Linux/NativeRegisterContextLinux_arm64.cpp      | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp b/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
index d4c83dcba6363f5..b4ae5dd7c35c9cf 100644
--- a/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
+++ b/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
@@ -665,7 +665,10 @@ NativeRegisterContextLinux_arm64::CacheAllRegisters(uint32_t &cached_size) {
 
     // We will only be restoring ZT data if ZA is active. As writing to an
     // inactive ZT enables ZA, which may not be desireable.
-    if (GetRegisterInfo().IsZTEnabled() &&
+    if (
+      // If we have ZT0, or in other words, if we have SME2.
+      GetRegisterInfo().IsZTEnabled() &&
+      // And ZA is active, which means that ZT0 is also active.
         m_za_header.size > sizeof(m_za_header)) {
       cached_size += sizeof(RegisterSetType) + GetZTBufferSize();
       // Unlike ZA where we have to fake data for an inactive ZA, the kernel
@@ -781,12 +784,16 @@ Status NativeRegisterContextLinux_arm64::ReadAllRegisterValues(
                             m_za_header.size);
   }
 
-  // If ZT is present and we are going to be restoring an active ZA (which
+  // If ZT0 is present and we are going to be restoring an active ZA (which
   // implies an active ZT0), then restore ZT0 after ZA has been set. This
   // prevents us enabling ZA accidentally after the restore of ZA disabled it.
   // If we leave ZA/ZT0 inactive and read ZT0, the kernel returns 0s. Therefore
   // there's nothing for us to restore if ZA was originally inactive.
-  if (GetRegisterInfo().IsZTEnabled() && m_za_header.size > sizeof(m_za_header))
+  if (
+    // If we have SME2 and therefore ZT0.
+    GetRegisterInfo().IsZTEnabled() &&
+    // And ZA is enabled.
+    m_za_header.size > sizeof(m_za_header))
     dst = AddSavedRegisters(dst, RegisterSetType::SME2, GetZTBuffer(),
                             GetZTBufferSize());
 



More information about the lldb-commits mailing list