[PATCH] D146215: [BOLT] Reject symbols pointing to section end

Job Noorman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 16 04:15:14 PDT 2023


jobnoorman created this revision.
jobnoorman added reviewers: yota9, maksfb, rafauler, Amir.
Herald added subscribers: asb, treapster, pmatos, ayermolo.
Herald added a project: All.
jobnoorman requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Sometimes, symbols are present that point to the end of a section (i.e.,
one past the highest valid address). Currently, BOLT either rejects
those symbols when they don't point into another existing section, or
reports an error when they do and that other section is not executable.
I suppose BOLT would accept the symbol if it pointed into an executable
section.

In any case, these symbols should not be considered while discovering
functions and should not result in an error. This patch implements that.

Note that this patch checks explicitly for symbols whose value equals
the end of their section. It might make more sense to verify that the
symbol's value is within [section start, section end). However, I'm not
sure whether a symbol's value could ever fall outside that range without
being equal to the section end.
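
To make the difference between the two checks concrete, here is a minimal
standalone sketch. SectionRange, pointsToSectionEnd and isInsideSection are
hypothetical names used only for illustration; they are not BOLT APIs. The
patch itself compares against Section->getAddress() + Section->getSize().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a section's address range; not a BOLT type.
struct SectionRange {
  uint64_t Address; // first valid address of the section
  uint64_t Size;    // size of the section in bytes

  // One past the highest valid address of the section.
  uint64_t end() const {
    assert(Address + Size >= Address && "section range overflows");
    return Address + Size;
  }
};

// The check this patch performs: the symbol's value equals the section end.
bool pointsToSectionEnd(const SectionRange &S, uint64_t SymbolAddress) {
  return SymbolAddress == S.end();
}

// The stricter alternative: the value lies in [section start, section end).
bool isInsideSection(const SectionRange &S, uint64_t SymbolAddress) {
  return SymbolAddress >= S.Address && SymbolAddress < S.end();
}
```

With the range check, a symbol would be rejected whenever isInsideSection
returns false, which covers the end-of-section case and any other
out-of-range value.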

Another way to implement this is to verify that the BinarySection we
find at the symbol's address actually corresponds to the symbol's
section. I'm not sure what the best approach is so feedback is welcome.
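
As a rough sketch of that second approach, under the assumption that
sections do not overlap and can be looked up by address: Sec, SectionMap,
findSectionAt and addressMatchesOwnSection are hypothetical names for
illustration and do not correspond to BOLT's BinarySection interface.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

// Hypothetical section record; not BOLT's BinarySection.
struct Sec {
  std::string Name;
  uint64_t Address;
  uint64_t Size;
};

// Sections keyed by start address, assumed not to overlap.
using SectionMap = std::map<uint64_t, Sec>;

// Return the section whose [Address, Address + Size) contains Addr,
// or nullptr if no section covers it.
const Sec *findSectionAt(const SectionMap &Sections, uint64_t Addr) {
  auto It = Sections.upper_bound(Addr); // first section starting after Addr
  if (It == Sections.begin())
    return nullptr;
  const Sec &S = std::prev(It)->second;
  assert(S.Address <= Addr && "upper_bound guarantees this ordering");
  return Addr - S.Address < S.Size ? &S : nullptr;
}

// True if the section found at the symbol's address is the symbol's own
// section. An end-of-section symbol fails this test: its address resolves
// either to no section or to the section that follows.
bool addressMatchesOwnSection(const SectionMap &Sections, const Sec &Own,
                              uint64_t SymbolAddress) {
  const Sec *Found = findSectionAt(Sections, SymbolAddress);
  return Found && Found->Address == Own.Address;
}
```

Compared to the explicit equality check, this variant needs no special
case for the section end, at the cost of an extra lookup per symbol.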


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D146215

Files:
  bolt/lib/Rewrite/RewriteInstance.cpp
  bolt/test/X86/section-end-sym.s


Index: bolt/test/X86/section-end-sym.s
===================================================================
--- /dev/null
+++ bolt/test/X86/section-end-sym.s
@@ -0,0 +1,29 @@
+## Check that BOLT doesn't consider end-of-section symbols (e.g., _etext) as
+## functions.
+
+# REQUIRES: system-linux
+
+# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-linux %s -o %t.o
+# RUN: ld.lld %t.o -o %t.exe -q
+# RUN: llvm-bolt %t.exe -o /dev/null --print-cfg --debug-only=bolt 2>&1 \
+# RUN:   | FileCheck %s
+
+# CHECK: considering symbol etext for function
+# CHECK-NEXT: rejecting as symbol points to end of its section
+# CHECK-NOT: Binary Function "etext{{.*}}" after building cfg
+
+
+  .text
+  .globl _start
+  .type _start, @function
+_start:
+  retq
+  .size _start, .-_start
+
+  .align 0x1000
+  .globl etext
+etext:
+
+  .data
+.Lfoo:
+  .word 0
Index: bolt/lib/Rewrite/RewriteInstance.cpp
===================================================================
--- bolt/lib/Rewrite/RewriteInstance.cpp
+++ bolt/lib/Rewrite/RewriteInstance.cpp
@@ -1058,6 +1058,16 @@
       continue;
     }
 
+    if (Address == Section->getAddress() + Section->getSize()) {
+      assert(SymbolSize == 0 &&
+             "unexpected non-zero sized symbol at end of section");
+      LLVM_DEBUG(
+          dbgs()
+          << "BOLT-DEBUG: rejecting as symbol points to end of its section\n");
+      registerName(SymbolSize);
+      continue;
+    }
+
     // Assembly functions could be ST_NONE with 0 size. Check that the
     // corresponding section is a code section and they are not inside any
     // other known function to consider them.

