[lld] [lld][ELF] Introduce an option to keep data section prefix. (PR #148985)

Mingming Liu via llvm-commits llvm-commits at lists.llvm.org
Sun Jul 27 16:58:39 PDT 2025


================
@@ -103,13 +109,42 @@ StringRef LinkerScript::getOutputSectionName(const InputSectionBase *s) const {
     return ".text";
   }
 
-  for (StringRef v : {".data.rel.ro", ".data",       ".rodata",
-                      ".bss.rel.ro",  ".bss",        ".ldata",
-                      ".lrodata",     ".lbss",       ".gcc_except_table",
-                      ".init_array",  ".fini_array", ".tbss",
-                      ".tdata",       ".ARM.exidx",  ".ARM.extab",
-                      ".ctors",       ".dtors",      ".sbss",
-                      ".sdata",       ".srodata"})
+  // When zKeepDataSectionPrefix is true, keep .hot and .unlikely suffixes
+  // in data sections.
+  static constexpr StringRef dataSectionPrefixes[] = {
+      ".data.rel.ro", ".data", ".rodata", ".bss.rel.ro", ".bss",
+  };
+
+  for (auto [index, v] : llvm::enumerate(dataSectionPrefixes)) {
+    StringRef secName = s->name;
+    if (trimSectionPrefix(v, secName)) {
+      if (!ctx.arg.zKeepDataSectionPrefix)
+        return v;
+      if (isSectionPrefix(".hot", secName))
+        return s->name.substr(0, v.size() + 4);
+      if (isSectionPrefix(".unlikely", secName))
+        return s->name.substr(0, v.size() + 9);
+      // For .rodata,  a section could be`.rodata.cst<N>.hot.` for constant
+      // pool or  `rodata.str<N>.hot.` for string literals.
+      if (index == 2) {
+        // The reason to specialize this path is to spell out .rodata.hot and
----------------
mingmingl-llvm wrote:

> The comment is too verbose ... making it easy for users to understand its semantics without adding comments that redundantly explain the code

done.

> Now I am unsure why we want to support .rodata.hot while we don't support .rodata.cst4.hot.

If I understand correctly, the question is why [1] exists when `.rodata.cst4.hot` doesn't end with a trailing dot.

Basically, [1] is meant to group both string literals (with name pattern `.rodata.str1.1.unlikely.` [2]) and constant pools (with name pattern `.rodata.cst<N>.hot` [3]). As the test case examples show, string literal section names have a trailing dot but constant pool section names don't.

Based on https://github.com/llvm/llvm-project/pull/148985#discussion_r2221141273, we need a trailing dot to disambiguate between `.rodata.<var-name>` and `.rodata.<hotness>` when `<var-name>` is `hot` (something like https://gcc.godbolt.org/z/95f4oafcn). I prepared https://github.com/llvm/llvm-project/pull/150859 to add a trailing dot for constant pools when there is `.hot` or `.unlikely` in the section names. Without PR 150859, only string literals will be grouped by [1].
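To make the trailing-dot argument concrete, here is a minimal standalone sketch of the suffix check in [1]. It is not the lld code: `endsWith` and `rodataOutputName` are hypothetical names, `std::string_view` stands in for `StringRef`, and the fallback to plain `.rodata` is an assumption that ignores the earlier `.hot`/`.unlikely` prefix handling in the real function.

```cpp
#include <string_view>

// Hypothetical helper (not part of lld): true if `name` ends with `suffix`.
constexpr bool endsWith(std::string_view name, std::string_view suffix) {
  return name.size() >= suffix.size() &&
         name.substr(name.size() - suffix.size()) == suffix;
}

// Hypothetical stand-in for the .rodata branch quoted in [1]. Only names
// that end with the hotness suffix *and* a trailing dot are regrouped;
// everything else falls back to ".rodata" (an assumption of this sketch).
constexpr std::string_view rodataOutputName(std::string_view inputName) {
  if (endsWith(inputName, ".hot."))
    return ".rodata.hot";
  if (endsWith(inputName, ".unlikely."))
    return ".rodata.unlikely";
  return ".rodata";
}

// String literal sections already carry the trailing dot, so they regroup.
static_assert(rodataOutputName(".rodata.str1.1.unlikely.") == ".rodata.unlikely");
// A constant pool section without the trailing dot is not regrouped.
static_assert(rodataOutputName(".rodata.cst4.hot") == ".rodata");
// With a trailing dot (the naming PR 150859 proposes), it is regrouped.
static_assert(rodataOutputName(".rodata.cst4.hot.") == ".rodata.hot");
// A variable that happens to be named `hot` is left alone.
static_assert(rodataOutputName(".rodata.hot") == ".rodata");
```

The last assertion is the case the trailing dot exists for: without it, a variable named `hot` would be indistinguishable from a hotness suffix.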


[1]
```
if (index == 2) {
  // Place input .rodata.str<N>.hot. or .rodata.cst<N>.hot. into the
  // .rodata.hot section.
  if (s->name.ends_with(".hot."))
    return ".rodata.hot";
  // Place input .rodata.str<N>.unlikely. or .rodata.cst<N>.unlikely. into the
  // .rodata.unlikely section.
  if (s->name.ends_with(".unlikely."))
    return ".rodata.unlikely";
}
```
[2] https://github.com/llvm/llvm-project/blob/314e22bcab2b0f3d208708431a14215058f0718f/llvm/test/CodeGen/X86/global-variable-partition.ll#L60-L66

[3] https://github.com/llvm/llvm-project/blob/314e22bcab2b0f3d208708431a14215058f0718f/llvm/test/CodeGen/X86/constant-pool-partition.ll#L27-L31


https://github.com/llvm/llvm-project/pull/148985

