[lld] [MTE] [lld] Don't tag symbols in sections with implicit start/stop (PR #73531)
Mitch Phillips via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 27 07:49:29 PST 2023
https://github.com/hctim updated https://github.com/llvm/llvm-project/pull/73531
From 1b23c642357176360b7038df9082373ff076af90 Mon Sep 17 00:00:00 2001
From: Mitch Phillips <mitchp at google.com>
Date: Mon, 27 Nov 2023 16:41:03 +0100
Subject: [PATCH 1/2] [MTE] [lld] Don't tag symbols in sections with implicit
start/stop
Found after a new Mali GPU driver drop into Android, but I'm guessing
the reason why the implicit start/stop feature exists is to facilitate
exactly this type of iteration.

When GVs are explicitly placed into a section, and that section's name
is a valid C identifier, then lld will create synthetic
__start_<secname> and __stop_<secname> symbols for the bounds of that
section. The Mali code in question then iterates over the GVs by using
these symbols as bounds.

This falls over under MTE, as each individual GV has its own tag (and
the __start_<secname> and __stop_<secname> symbols are untagged), and
so you get SIGSEGV/MTE[AS]ERR at runtime.

Given this is probably a by-design pattern, let's exclude any GVs from
being tagged if they're going into a section with implicit start/stop
symbols.
---
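For readers unfamiliar with the pattern, here is a minimal sketch of the
kind of start/stop iteration described in the commit message. It is not
the Mali code. The entry names and the collectMetadata helper are
hypothetical, and the sketch assumes an ELF target linked with lld or
GNU ld, which synthesize __start_metadata_strings and
__stop_metadata_strings because the section name is a valid C
identifier.

```cpp
#include <set>
#include <string>

// Hypothetical entries, each explicitly placed into the same section.
// "used" keeps the compiler from dropping them as unreferenced.
#define METADATA_ENTRY __attribute__((used, section("metadata_strings")))

METADATA_ENTRY const char *entry_alpha = "alpha";
METADATA_ENTRY const char *entry_beta = "beta";
METADATA_ENTRY const char *entry_gamma = "gamma";

// Bounds of the section, defined by the linker rather than by any
// object file (assumption: ELF output, lld or GNU ld).
extern "C" const char *__start_metadata_strings[];
extern "C" const char *__stop_metadata_strings[];

// Walks every pointer-sized entry between the two bounds. The order of
// entries is up to the toolchain, so the result is returned as a set.
std::set<std::string> collectMetadata() {
  std::set<std::string> out;
  for (const char **p = __start_metadata_strings;
       p != __stop_metadata_strings; ++p)
    out.insert(*p);
  return out;
}
```

If each entry carried its own MTE tag, the pointer derived from
__start_metadata_strings would carry none of them, so the dereference
inside the loop would be the access that faults.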
lld/ELF/Arch/AArch64.cpp | 12 +++++++
lld/test/ELF/aarch64-memtag-startstop.s | 48 +++++++++++++++++++++++++
2 files changed, 60 insertions(+)
create mode 100644 lld/test/ELF/aarch64-memtag-startstop.s
diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index 048f0ec30ebd283..0dff31c6d0379ed 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -999,6 +999,18 @@ addTaggedSymbolReferences(InputSectionBase &sec,
// like functions or TLS symbols.
if (sym.type != STT_OBJECT)
continue;
+ // Global variables can be explicitly put into sections where the section
+ // name is a valid C identifier. In these cases, the linker will create
+ // implicit __start and __stop symbols for the section. Globals that are in
+ // these special sections often are done there intentionally so they can be
+ // iterated over by going from __start_<secname> -> __stop_<secname>. This
+ // doesn't work under MTE globals, because each GV would have its own unique
+ // tag. So, symbols that are destined for a special section with start/stop
+ // symbols should go untagged implicitly.
+ const Defined* defined_sym = dyn_cast<Defined>(&sym);
+ if (defined_sym && defined_sym->section &&
+ isValidCIdentifier(defined_sym->section->name))
+ continue;
// STB_LOCAL symbols can't be referenced from outside the object file, and
// thus don't need to be checked for references from other object files.
if (sym.binding == STB_LOCAL) {
diff --git a/lld/test/ELF/aarch64-memtag-startstop.s b/lld/test/ELF/aarch64-memtag-startstop.s
new file mode 100644
index 000000000000000..a8bdcd8929d4879
--- /dev/null
+++ b/lld/test/ELF/aarch64-memtag-startstop.s
@@ -0,0 +1,48 @@
+# REQUIRES: aarch64
+# RUN: llvm-mc -filetype=obj -triple=aarch64-none-linux-android %s -o %t
+# RUN: ld.lld %t -o %t.so -shared --android-memtag-mode=sync
+
+## Normally relocations are printed before the symbol tables, so reorder it a
+## bit to make it easier on matching addresses of relocations up with the
+## symbols.
+# RUN: llvm-readelf %t.so -s > %t.out
+# RUN: llvm-readelf %t.so --section-headers --relocs --memtag >> %t.out
+# RUN: FileCheck %s < %t.out
+
+# CHECK: Symbol table '.dynsym' contains
+# CHECK-DAG: [[#%x,GLOBAL:]] 16 OBJECT GLOBAL DEFAULT [[#]] global{{$}}
+# CHECK-DAG: [[#%x,GLOBAL_IN_SECTION:]] 16 OBJECT GLOBAL DEFAULT [[#]] global_in_section{{$}}
+
+# CHECK: Section Headers:
+# CHECK: .memtag.globals.dynamic AARCH64_MEMTAG_GLOBALS_DYNAMIC
+# CHECK-NOT: .memtag.globals.static
+# CHECK-NOT: AARCH64_MEMTAG_GLOBALS_STATIC
+
+# CHECK: Memtag Dynamic Entries
+# CHECK-NEXT: AARCH64_MEMTAG_MODE: Synchronous (0)
+# CHECK-NEXT: AARCH64_MEMTAG_HEAP: Disabled (0)
+# CHECK-NEXT: AARCH64_MEMTAG_STACK: Disabled (0)
+# CHECK-NEXT: AARCH64_MEMTAG_GLOBALS: 0x{{[0-9a-f]+}}
+# CHECK-NEXT: AARCH64_MEMTAG_GLOBALSSZ: 3
+
+# CHECK: Memtag Global Descriptors:
+# CHECK-NEXT: 0x[[#GLOBAL]]: 0x10
+# CHECK-NOT: 0x
+
+ .memtag global
+ .type global,@object
+ .bss
+ .globl global
+ .p2align 4, 0x0
+global:
+ .zero 16
+ .size global, 16
+
+ .memtag global_in_section // @global_in_section
+ .type global_in_section,@object
+ .section metadata_strings,"aw",@progbits
+ .globl global_in_section
+ .p2align 4, 0x0
+global_in_section:
+ .zero 16
+ .size global_in_section, 16
From 9e232b078213d8237e06c031defd9226f9263b7e Mon Sep 17 00:00:00 2001
From: Mitch Phillips <mitchp at google.com>
Date: Mon, 27 Nov 2023 16:49:09 +0100
Subject: [PATCH 2/2] clang-format
---
lld/ELF/Arch/AArch64.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index 0dff31c6d0379ed..af44668871dd576 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -1007,7 +1007,7 @@ addTaggedSymbolReferences(InputSectionBase &sec,
// doesn't work under MTE globals, because each GV would have its own unique
// tag. So, symbols that are destined for a special section with start/stop
// symbols should go untagged implicitly.
- const Defined* defined_sym = dyn_cast<Defined>(&sym);
+ const Defined *defined_sym = dyn_cast<Defined>(&sym);
if (defined_sym && defined_sym->section &&
isValidCIdentifier(defined_sym->section->name))
continue;
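
The new check depends on whether the section name is a valid C
identifier. The standalone sketch below illustrates that rule. It is
not lld's isValidCIdentifier, and the name looksLikeCIdentifier is
hypothetical; it assumes the usual ASCII definition (a letter or
underscore first, then letters, digits, or underscores).

```cpp
#include <string>

// Hypothetical helper illustrating the rule; not lld's implementation.
bool looksLikeCIdentifier(const std::string &name) {
  auto isAlphaOrUnderscore = [](char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
  };
  auto isDigit = [](char c) { return c >= '0' && c <= '9'; };

  // An empty name or a leading digit is not an identifier.
  if (name.empty() || !isAlphaOrUnderscore(name[0]))
    return false;
  // Every remaining character must be a letter, digit, or underscore.
  for (char c : name)
    if (!isAlphaOrUnderscore(c) && !isDigit(c))
      return false;
  return true;
}
```

Under this rule the test's metadata_strings section qualifies, while
.bss does not because of the leading dot, which matches the test
expecting a descriptor only for the global placed in .bss.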