[llvm] [X86] Don't respect large data threshold for globals with an explicit section (PR #78348)

Arthur Eubanks via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 16 14:48:48 PST 2024


https://github.com/aeubanks updated https://github.com/llvm/llvm-project/pull/78348

From fb3d647583c0c7c79b7121a88762d52bfeb7306d Mon Sep 17 00:00:00 2001
From: Arthur Eubanks <aeubanks at google.com>
Date: Tue, 16 Jan 2024 20:32:16 +0000
Subject: [PATCH 1/2] [X86] Don't respect large data threshold for globals with
 an explicit section

If multiple globals are placed in the same explicit section, the large
data threshold may classify some of them as large and others as small.
Input section flags are unioned, so once any TU adds the large section
flag, the whole section is considered large, even though other TUs may
still contain 32-bit references to it. Mixing sections with mismatched
large section flags this way can cause undesirable issues like increased
relocation pressure.

An explicit code model on the global still overrides the decision. That
override is safe for globals without any references to them, like what
we did with asan_globals in #74514: if some precompiled small code model
files where asan_globals is not considered large are mixed with
medium/large code model files, that's ok because the section is
considered large and placed farther away. However, overriding the code
model for globals in some TUs but not others while code still references
them will result in the undesired behavior described above.

This mitigates a whole class of mismatched large section flag issues
like what #77986 was trying to fix.

As a side effect, the SHF_X86_64_LARGE section flag is no longer added
to explicit sections in the medium/large code model. This is ok for the
large code model since all references from large text must use 64-bit
relocations anyway.
---
 llvm/lib/Target/TargetMachine.cpp                | 10 ++++++++--
 llvm/test/CodeGen/X86/code-model-elf-sections.ll | 12 ++++++------
 2 files changed, 14 insertions(+), 8 deletions(-)
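
The flag mismatch described in the commit message can be illustrated
with a small standalone model. This is only a sketch, not linker code:
`InputSection`, `unionFlags` and `hasMixedLargeFlag` are hypothetical
names invented for the example, and the only behavior assumed is the one
stated above, that input section flags are unioned. The flag values are
the standard ELF ones.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Standard ELF section flag values.
constexpr std::uint64_t SHF_WRITE = 0x1;
constexpr std::uint64_t SHF_ALLOC = 0x2;
constexpr std::uint64_t SHF_X86_64_LARGE = 0x10000000;

// Hypothetical stand-in for one TU's contribution to a named section.
struct InputSection {
  std::string Name;
  std::uint64_t Flags;
};

// Models the union of flags across all inputs with the same name.
std::uint64_t unionFlags(const std::vector<InputSection> &Inputs) {
  assert(!Inputs.empty() && "expected at least one input section");
  std::uint64_t Out = 0;
  for (const InputSection &S : Inputs)
    Out |= S.Flags;
  return Out;
}

// True when some inputs carry the large flag and others do not, which is
// the situation the patch tries to avoid for explicit sections.
bool hasMixedLargeFlag(const std::vector<InputSection> &Inputs) {
  bool SawLarge = false, SawSmall = false;
  for (const InputSection &S : Inputs) {
    if (S.Flags & SHF_X86_64_LARGE)
      SawLarge = true;
    else
      SawSmall = true;
  }
  return SawLarge && SawSmall;
}
```

If one TU emits `foo` as WA and another emits it as WAl, `unionFlags`
reports the large flag for the combined section even though the first TU
may have been compiled with 32-bit references to it.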

diff --git a/llvm/lib/Target/TargetMachine.cpp b/llvm/lib/Target/TargetMachine.cpp
index 2a4383314e4656..48a99dca0a6c9b 100644
--- a/llvm/lib/Target/TargetMachine.cpp
+++ b/llvm/lib/Target/TargetMachine.cpp
@@ -83,8 +83,14 @@ bool TargetMachine::isLargeGlobalValue(const GlobalValue *GVal) const {
       return true;
   }
 
-  if (getCodeModel() == CodeModel::Medium ||
-      getCodeModel() == CodeModel::Large) {
+  // Respect large data threshold for medium and large code models.
+  // ... But only for globals without an explicit section. If multiple globals
+  // are placed in an explicit section, there's a good chance that the data
+  // threshold will cause the different globals to be inconsistent in whether
+  // they're large or small. Mixing large section flags can cause undesirable
+  // issues like increased relocation pressure.
+  if (!GV->hasSection() && (getCodeModel() == CodeModel::Medium ||
+                            getCodeModel() == CodeModel::Large)) {
     if (!GV->getValueType()->isSized())
       return true;
     const DataLayout &DL = GV->getParent()->getDataLayout();
diff --git a/llvm/test/CodeGen/X86/code-model-elf-sections.ll b/llvm/test/CodeGen/X86/code-model-elf-sections.ll
index 749d5b6bf904e5..fc5dac89a6b3d3 100644
--- a/llvm/test/CodeGen/X86/code-model-elf-sections.ll
+++ b/llvm/test/CodeGen/X86/code-model-elf-sections.ll
@@ -55,13 +55,13 @@
 
 ; LARGE: .data {{.*}} WA {{.*}}
 ; LARGE: .data.x {{.*}} WA {{.*}}
-; LARGE: .data0 {{.*}} WAl {{.*}}
+; LARGE: .data0 {{.*}} WA {{.*}}
 ; LARGE: .ldata {{.*}} WAl {{.*}}
 ; LARGE: .ldata.x {{.*}} WAl {{.*}}
-; LARGE: .ldata0 {{.*}} WAl {{.*}}
+; LARGE: .ldata0 {{.*}} WA {{.*}}
 ; LARGE: force_small {{.*}} WA {{.*}}
 ; LARGE: force_large {{.*}} WAl {{.*}}
-; LARGE: foo {{.*}} WAl {{.*}}
+; LARGE: foo {{.*}} WA {{.*}}
 ; LARGE: .bss {{.*}} WA {{.*}}
 ; LARGE: .lbss {{.*}} WAl {{.*}}
 ; LARGE: .rodata {{.*}} A {{.*}}
@@ -72,14 +72,14 @@
 
 ; LARGE-DS: .data {{.*}} WA {{.*}}
 ; LARGE-DS: .data.x {{.*}} WA {{.*}}
-; LARGE-DS: .data0 {{.*}} WAl {{.*}}
+; LARGE-DS: .data0 {{.*}} WA {{.*}}
 ; LARGE-DS: .ldata {{.*}} WAl {{.*}}
 ; LARGE-DS: .ldata.x {{.*}} WAl {{.*}}
-; LARGE-DS: .ldata0 {{.*}} WAl {{.*}}
+; LARGE-DS: .ldata0 {{.*}} WA {{.*}}
 ; LARGE-DS: .ldata.data {{.*}} WAl {{.*}}
 ; LARGE-DS: force_small {{.*}} WA {{.*}}
 ; LARGE-DS: force_large {{.*}} WAl {{.*}}
-; LARGE-DS: foo {{.*}} WAl {{.*}}
+; LARGE-DS: foo {{.*}} WA {{.*}}
 ; LARGE-DS: .bss {{.*}} WA {{.*}}
 ; LARGE-DS: .lbss.bss {{.*}} WAl {{.*}}
 ; LARGE-DS: .rodata {{.*}} A {{.*}}

From 6812b0a145824f4984b34a8c70b2d81f77ba5648 Mon Sep 17 00:00:00 2001
From: Arthur Eubanks <aeubanks at google.com>
Date: Tue, 16 Jan 2024 22:48:33 +0000
Subject: [PATCH 2/2] refactor

---
 llvm/lib/Target/TargetMachine.cpp | 40 +++++++++++++------------------
 1 file changed, 17 insertions(+), 23 deletions(-)
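
The order of checks after this refactor can be modeled outside of LLVM
roughly as follows. This is a hedged sketch covering only the steps
visible in the diff: `GlobalModel`, `isSectionPrefix` and `isLargeModel`
are hypothetical names, any checks that precede the thread-local test
are omitted, the explicit large code model branch is inferred from the
"ditto for large" comment, and the final size comparison is an
assumption because the diff does not show it.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

enum class CodeModel { Small, Medium, Large };

// Hypothetical stand-in for the properties of a GlobalVariable used here.
struct GlobalModel {
  std::string_view Section;            // empty means no explicit section
  std::optional<CodeModel> ExplicitCM; // per-global code model, if any
  std::uint64_t SizeInBytes = 0;
  bool ThreadLocal = false;
};

// Mirrors the IsPrefix lambda: Name is Prefix itself or Prefix followed
// by '.', so ".ldata.x" matches ".ldata" but ".ldata0" does not.
bool isSectionPrefix(std::string_view Name, std::string_view Prefix) {
  assert(!Prefix.empty() && "prefix must not be empty");
  if (Name.substr(0, Prefix.size()) != Prefix)
    return false;
  Name.remove_prefix(Prefix.size());
  return Name.empty() || Name[0] == '.';
}

bool isLargeModel(const GlobalModel &GV, CodeModel TargetCM,
                  std::uint64_t Threshold) {
  if (GV.ThreadLocal)
    return false;
  // 1. An explicit code model on the global overrides everything below.
  if (GV.ExplicitCM) {
    if (*GV.ExplicitCM == CodeModel::Small)
      return false;
    if (*GV.ExplicitCM == CodeModel::Large)
      return true;
  }
  // 2. Explicit sections are small unless they use a standard large name.
  if (!GV.Section.empty())
    return isSectionPrefix(GV.Section, ".lbss") ||
           isSectionPrefix(GV.Section, ".ldata") ||
           isSectionPrefix(GV.Section, ".lrodata");
  // 3. Otherwise the threshold applies in the medium/large code models.
  //    Assumption: ">=" is illustrative; the diff omits the comparison.
  if (TargetCM == CodeModel::Medium || TargetCM == CodeModel::Large)
    return GV.SizeInBytes >= Threshold;
  return false;
}
```

Under this model a global in section `foo` or `.ldata0` is small in the
large code model regardless of its size, which lines up with the WAl to
WA changes in the test expectations of patch 1, while `.ldata.x` and a
global with an explicit large code model stay large.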

diff --git a/llvm/lib/Target/TargetMachine.cpp b/llvm/lib/Target/TargetMachine.cpp
index 48a99dca0a6c9b..a518c06819a98d 100644
--- a/llvm/lib/Target/TargetMachine.cpp
+++ b/llvm/lib/Target/TargetMachine.cpp
@@ -58,24 +58,8 @@ bool TargetMachine::isLargeGlobalValue(const GlobalValue *GVal) const {
   if (GV->isThreadLocal())
     return false;
 
-  // We should properly mark well-known section name prefixes as small/large,
-  // because otherwise the output section may have the wrong section flags and
-  // the linker will lay it out in an unexpected way.
-  StringRef Name = GV->getSection();
-  if (!Name.empty()) {
-    auto IsPrefix = [&](StringRef Prefix) {
-      StringRef S = Name;
-      return S.consume_front(Prefix) && (S.empty() || S[0] == '.');
-    };
-    if (IsPrefix(".bss") || IsPrefix(".data") || IsPrefix(".rodata"))
-      return false;
-    if (IsPrefix(".lbss") || IsPrefix(".ldata") || IsPrefix(".lrodata"))
-      return true;
-  }
-
   // For x86-64, we treat an explicit GlobalVariable small code model to mean
   // that the global should be placed in a small section, and ditto for large.
-  // Well-known section names above take precedence for correctness.
   if (auto CM = GV->getCodeModel()) {
     if (*CM == CodeModel::Small)
       return false;
@@ -83,14 +67,24 @@ bool TargetMachine::isLargeGlobalValue(const GlobalValue *GVal) const {
       return true;
   }
 
+  // Treat all globals in explicit sections as small, except for the standard
+  // large sections of .lbss, .ldata, .lrodata. This reduces the risk of linking
+  // together small and large sections, resulting in small references to large
+  // data sections. The code model attribute overrides this above.
+  if (GV->hasSection()) {
+    StringRef Name = GV->getSection();
+    auto IsPrefix = [&](StringRef Prefix) {
+      StringRef S = Name;
+      return S.consume_front(Prefix) && (S.empty() || S[0] == '.');
+    };
+    if (IsPrefix(".lbss") || IsPrefix(".ldata") || IsPrefix(".lrodata"))
+      return true;
+    return false;
+  }
+
   // Respect large data threshold for medium and large code models.
-  // ... But only for globals without an explicit section. If multiple globals
-  // are placed in an explicit section, there's a good chance that the data
-  // threshold will cause the different globals to be inconsistent in whether
-  // they're large or small. Mixing large section flags can cause undesirable
-  // issues like increased relocation pressure.
-  if (!GV->hasSection() && (getCodeModel() == CodeModel::Medium ||
-                            getCodeModel() == CodeModel::Large)) {
+  if (getCodeModel() == CodeModel::Medium ||
+      getCodeModel() == CodeModel::Large) {
     if (!GV->getValueType()->isSized())
       return true;
     const DataLayout &DL = GV->getParent()->getDataLayout();


