[lld] [MTE] [lld] Don't tag symbols in sections with implicit start/stop (PR #73531)

Mitch Phillips via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 27 07:45:04 PST 2023


https://github.com/hctim created https://github.com/llvm/llvm-project/pull/73531

Found after a new Mali GPU driver drop landed in Android. I'm guessing
the implicit start/stop feature exists precisely to facilitate the type
of iteration described here.

When GVs are explicitly placed into a section whose name is a valid C
identifier, lld creates synthetic __start_<secname> and
__stop_<secname> symbols for the bounds of that section. The Mali code
in question then iterates over the GVs, using these symbols as bounds.
This falls over under MTE: each individual GV has its own tag, while
the __start_<secname> and __stop_<secname> symbols are untagged, so you
get SIGSEGV (SEGV_MTEAERR or SEGV_MTESERR) at runtime.
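
A minimal sketch of the iteration pattern described above, assuming an
ELF target and a GCC/Clang toolchain. The entry names and
countMetadataStrings() are hypothetical; the actual Mali driver code is
not shown in this PR.

```cpp
#include <cstddef>

// Place a variable in the section "metadata_strings". The name is a valid C
// identifier, so ELF linkers such as lld define __start_metadata_strings and
// __stop_metadata_strings when those symbols are referenced.
#define IN_METADATA_SECTION \
  __attribute__((section("metadata_strings"), used))

// Hypothetical entries, for illustration only.
IN_METADATA_SECTION static const char *entryA = "alpha";
IN_METADATA_SECTION static const char *entryB = "beta";
IN_METADATA_SECTION static const char *entryC = "gamma";

// Linker-provided bounds of the section.
extern "C" const char *__start_metadata_strings[];
extern "C" const char *__stop_metadata_strings[];

// Walk every pointer-sized entry between the two bounds and count it.
std::size_t countMetadataStrings() {
  std::size_t count = 0;
  for (const char **p = __start_metadata_strings;
       p != __stop_metadata_strings; ++p)
    ++count;
  return count;
}
```

With MTE globals enabled, each entry would carry its own tag, so a
pointer derived from the untagged start symbol would not match the tag
of the memory it walks over.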

Given this is probably a by-design pattern, let's exclude GVs from
being tagged if they're going into a section with implicit start/stop
symbols.
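
The exclusion rule can be sketched roughly as follows. This is a
standalone illustration, not lld's code: looksLikeCIdentifier() and
skipTagging() are hypothetical stand-ins for the isValidCIdentifier()
check the patch applies to the symbol's section name.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in: true if `name` is non-empty, does not start with a
// digit, and consists only of ASCII letters, digits, and underscores.
bool looksLikeCIdentifier(const std::string &name) {
  if (name.empty() || std::isdigit(static_cast<unsigned char>(name[0])))
    return false;
  for (unsigned char c : name)
    if (!std::isalnum(c) && c != '_')
      return false;
  return true;
}

// Hypothetical helper: a global is left untagged when its section name would
// get implicit __start_/__stop_ symbols.
bool skipTagging(const std::string &sectionName) {
  return looksLikeCIdentifier(sectionName);
}
```

Under this sketch, a global in "metadata_strings" is skipped, while one
in ".bss" is still eligible for tagging because the leading dot makes
the name an invalid C identifier.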


From 1b23c642357176360b7038df9082373ff076af90 Mon Sep 17 00:00:00 2001
From: Mitch Phillips <mitchp at google.com>
Date: Mon, 27 Nov 2023 16:41:03 +0100
Subject: [PATCH] [MTE] [lld] Don't tag symbols in sections with implicit
 start/stop

Found after a new Mali GPU driver drop landed in Android. I'm guessing
the implicit start/stop feature exists precisely to facilitate the type
of iteration described here.

When GVs are explicitly placed into a section whose name is a valid C
identifier, lld creates synthetic __start_<secname> and
__stop_<secname> symbols for the bounds of that section. The Mali code
in question then iterates over the GVs, using these symbols as bounds.
This falls over under MTE: each individual GV has its own tag, while
the __start_<secname> and __stop_<secname> symbols are untagged, so you
get SIGSEGV (SEGV_MTEAERR or SEGV_MTESERR) at runtime.

Given this is probably a by-design pattern, let's exclude GVs from
being tagged if they're going into a section with implicit start/stop
symbols.
---
 lld/ELF/Arch/AArch64.cpp                | 12 +++++++
 lld/test/ELF/aarch64-memtag-startstop.s | 48 +++++++++++++++++++++++++
 2 files changed, 60 insertions(+)
 create mode 100644 lld/test/ELF/aarch64-memtag-startstop.s

diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index 048f0ec30ebd283..0dff31c6d0379ed 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -999,6 +999,18 @@ addTaggedSymbolReferences(InputSectionBase &sec,
     // like functions or TLS symbols.
     if (sym.type != STT_OBJECT)
       continue;
+    // Global variables can be explicitly placed into sections whose name is a
+    // valid C identifier. In these cases, the linker creates implicit
+    // __start_<secname> and __stop_<secname> symbols for the section. Globals
+    // are often placed in such sections intentionally, so that they can be
+    // iterated over by walking from __start_<secname> to __stop_<secname>.
+    // This doesn't work under MTE globals, because each GV would have its own
+    // unique tag. So, symbols destined for a section with implicit start/stop
+    // symbols are implicitly left untagged.
+    const Defined *definedSym = dyn_cast<Defined>(&sym);
+    if (definedSym && definedSym->section &&
+        isValidCIdentifier(definedSym->section->name))
+      continue;
     // STB_LOCAL symbols can't be referenced from outside the object file, and
     // thus don't need to be checked for references from other object files.
     if (sym.binding == STB_LOCAL) {
diff --git a/lld/test/ELF/aarch64-memtag-startstop.s b/lld/test/ELF/aarch64-memtag-startstop.s
new file mode 100644
index 000000000000000..a8bdcd8929d4879
--- /dev/null
+++ b/lld/test/ELF/aarch64-memtag-startstop.s
@@ -0,0 +1,48 @@
+# REQUIRES: aarch64
+# RUN: llvm-mc -filetype=obj -triple=aarch64-none-linux-android %s -o %t
+# RUN: ld.lld %t -o %t.so -shared --android-memtag-mode=sync
+
+## Normally relocations are printed before the symbol tables, so reorder the
+## output a bit to make it easier to match relocation addresses up with the
+## symbols.
+# RUN: llvm-readelf %t.so -s > %t.out
+# RUN: llvm-readelf %t.so --section-headers --relocs --memtag >> %t.out
+# RUN: FileCheck %s < %t.out
+
+# CHECK:     Symbol table '.dynsym' contains
+# CHECK-DAG: [[#%x,GLOBAL:]] 16 OBJECT GLOBAL DEFAULT [[#]] global{{$}}
+# CHECK-DAG: [[#%x,GLOBAL_IN_SECTION:]] 16 OBJECT GLOBAL DEFAULT [[#]] global_in_section{{$}}
+
+# CHECK:     Section Headers:
+# CHECK:     .memtag.globals.dynamic AARCH64_MEMTAG_GLOBALS_DYNAMIC
+# CHECK-NOT: .memtag.globals.static
+# CHECK-NOT: AARCH64_MEMTAG_GLOBALS_STATIC
+
+# CHECK:      Memtag Dynamic Entries
+# CHECK-NEXT: AARCH64_MEMTAG_MODE: Synchronous (0)
+# CHECK-NEXT: AARCH64_MEMTAG_HEAP: Disabled (0)
+# CHECK-NEXT: AARCH64_MEMTAG_STACK: Disabled (0)
+# CHECK-NEXT: AARCH64_MEMTAG_GLOBALS: 0x{{[0-9a-f]+}}
+# CHECK-NEXT: AARCH64_MEMTAG_GLOBALSSZ: 3
+
+# CHECK:      Memtag Global Descriptors:
+# CHECK-NEXT: 0x[[#GLOBAL]]: 0x10
+# CHECK-NOT:  0x
+
+        .memtag   global
+        .type     global,@object
+        .bss
+        .globl    global
+        .p2align  4, 0x0
+global:
+        .zero     16
+        .size     global, 16
+
+        .memtag   global_in_section                  // @global_in_section
+        .type     global_in_section,@object
+        .section  metadata_strings,"aw",@progbits
+        .globl    global_in_section
+        .p2align  4, 0x0
+global_in_section:
+        .zero     16
+        .size     global_in_section, 16


