[llvm] [BOLT] Fix another cause of extra entry point misidentification (PR #155055)
YongKang Zhu via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 22 17:51:10 PDT 2025
https://github.com/yozhu created https://github.com/llvm/llvm-project/pull/155055
This time it is inline assembly code and function symbol does have non-zero size in symbol table. The function contains a constant island referenced by another function, which is discovered during relocation processing.
Also added an assertion in `addEntryPointAtOffset()` that extra entry point candidate does not point to constant data.
>From 93224bc79ace7c64e63085c2fd6087abcecb13b8 Mon Sep 17 00:00:00 2001
From: YongKang Zhu <yongzhu at fb.com>
Date: Thu, 21 Aug 2025 21:42:49 -0700
Subject: [PATCH] [BOLT] Fix another cause of extra entry point
misidentification
This time it is inline assembly code and function symbol does have
non-zero size in symbol table. The function has a constant island
that is referenced by another function, which is discovered during
relocation processing.
Also added an assertion in `addEntryPointAtOffset()` that extra entry
point candidate does not point to constant data.
---
bolt/lib/Core/BinaryFunction.cpp | 2 +
bolt/lib/Rewrite/RewriteInstance.cpp | 3 +-
.../AArch64/validate-secondary-entry-point.s | 44 +++++++++++++++++--
3 files changed, 45 insertions(+), 4 deletions(-)
diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp
index 8f494f105fbba..4b261ef7e6d7a 100644
--- a/bolt/lib/Core/BinaryFunction.cpp
+++ b/bolt/lib/Core/BinaryFunction.cpp
@@ -3771,6 +3771,8 @@ MCSymbol *BinaryFunction::addEntryPointAtOffset(uint64_t Offset) {
assert(Offset && "cannot add primary entry point");
const uint64_t EntryPointAddress = getAddress() + Offset;
+ assert(!isInConstantIsland(EntryPointAddress) &&
+ "cannot add entry point that points to constant data");
MCSymbol *LocalSymbol = getOrCreateLocalLabel(EntryPointAddress);
MCSymbol *EntrySymbol = getSecondaryEntryPointSymbol(LocalSymbol);
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 4f5a75b770ce6..a6e4dbc9c192f 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -2935,7 +2935,8 @@ void RewriteInstance::handleRelocation(const SectionRef &RelocatedSection,
ReferencedSymbol = nullptr;
ExtractedValue = Address;
} else if (RefFunctionOffset) {
- if (ContainingBF && ContainingBF != ReferencedBF) {
+ if (ContainingBF && ContainingBF != ReferencedBF &&
+ !ReferencedBF->isInConstantIsland(Address)) {
ReferencedSymbol =
ReferencedBF->addEntryPointAtOffset(RefFunctionOffset);
} else {
diff --git a/bolt/test/AArch64/validate-secondary-entry-point.s b/bolt/test/AArch64/validate-secondary-entry-point.s
index 0099a0ee4fe99..3ad69467fe54f 100644
--- a/bolt/test/AArch64/validate-secondary-entry-point.s
+++ b/bolt/test/AArch64/validate-secondary-entry-point.s
@@ -1,13 +1,23 @@
# This test is to verify that BOLT won't take a label pointing to constant
-# island as a secondary entry point (function `_start` doesn't have ELF size
-# set originally) and the function won't otherwise be mistaken as non-simple.
+# island as a secondary entry point. This could happen when function doesn't
+# have ELF size set if it is from assembly code, or a constant island is
+# referenced by another function discovered during relocation processing.
-# RUN: %clang %cflags -pie %s -o %t.so -Wl,-q -Wl,--init=_foo -Wl,--fini=_foo
+# RUN: split-file %s %t
+
+# RUN: %clang %cflags -pie %t/tt.asm -o %t.so \
+# RUN: -Wl,-q -Wl,--init=_foo -Wl,--fini=_foo
# RUN: llvm-bolt %t.so -o %t.bolt.so --print-cfg 2>&1 | FileCheck %s
# CHECK-NOT: BOLT-WARNING: reference in the middle of instruction detected \
# CHECK-NOT: function _start at offset 0x{{[0-9a-f]+}}
# CHECK: Binary Function "_start" after building cfg
+# RUN: %clang %cflags -ffunction-sections -shared %t/tt.c %t/ss.c -o %tt.so \
+# RUN: -Wl,-q -Wl,--init=_start -Wl,--fini=_start \
+# RUN: -Wl,--version-script=%t/linker_script
+# RUN: llvm-bolt %tt.so -o %tt.bolted.so
+
+;--- tt.asm
.text
.global _foo
@@ -32,3 +42,31 @@ _bar:
# Dummy relocation to force relocation mode
.reloc 0, R_AARCH64_NONE
+
+;--- tt.c
+void _start() {}
+
+__attribute__((naked)) void foo() {
+ asm("ldr x16, .L_fnptr\n"
+ "blr x16\n"
+ "ret\n"
+
+ "_rodatx:"
+ ".global _rodatx;"
+ ".quad 0;"
+ ".L_fnptr:"
+ ".quad 0;");
+}
+
+;--- ss.c
+__attribute__((visibility("hidden"))) extern void* _rodatx;
+void* bar() { return &_rodatx; }
+
+;--- linker_script
+{
+global:
+ _start;
+ foo;
+ bar;
+local: *;
+};
More information about the llvm-commits
mailing list