[lld] [WIP][lld] Add lld warning for checking discarded sections via COMDAT (PR #74151)

via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 1 14:40:56 PST 2023


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-lld-elf

Author: None (PiJoules)

<details>
<summary>Changes</summary>

(Just an experiment, not sure if this is worth landing.)

The warning asserts that all sections that would be discarded from the rules for comdat groups have the same contents and sections as the ones retained in the final binary. Specifically, for a given comdat group signature, this checks:
- The number of sections in that group is the same for all comdat groups with that signature.
- The names of all sections in that group are the same for all other comdat groups with that signature.
- The contents of all sections with the same name in all groups for that signature are the same.

The idea here was to detect ODR violations at the link-level. On rare occasions we've run into instances where completely different implementations for a function or data structure win at link time which this might help catch. Note all instances caught by this may actually be indicative of bugs. Due to interprocedural optimizations, it can be entirely possible for the same function in different object files to have slightly different assembly. This wouldn't be practical for relocation sections as well. This comes with an exclude list for disallowing checks on those sections.

Practically, the most I can see this being used for is checking that comdat groups accross all object files have the same named sections in them, and that data sections in those groups have the same contents.

---
Full diff: https://github.com/llvm/llvm-project/pull/74151.diff


13 Files Affected:

- (modified) lld/ELF/Config.h (+3) 
- (modified) lld/ELF/Driver.cpp (+14) 
- (modified) lld/ELF/InputFiles.cpp (+87-4) 
- (modified) lld/ELF/InputFiles.h (+4) 
- (modified) lld/ELF/Options.td (+9) 
- (modified) lld/ELF/SymbolTable.h (+10) 
- (added) lld/test/ELF/mismatch-sections-in-comdat-group-warning/Inputs/func.s (+4) 
- (added) lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-group-sizes.s (+17) 
- (added) lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-section-contents.s (+12) 
- (added) lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-section-names.s (+12) 
- (added) lld/test/ELF/mismatch-sections-in-comdat-group-warning/disabled-warning.s (+13) 
- (added) lld/test/ELF/mismatch-sections-in-comdat-group-warning/ignore-section.s (+9) 
- (added) lld/test/ELF/mismatch-sections-in-comdat-group-warning/no-warning.s (+11) 


``````````diff
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index 56229334f9a44ae..ed0aacb8ae755e3 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -295,6 +295,9 @@ struct Config {
   bool useAndroidRelrTags = false;
   bool warnBackrefs;
   llvm::SmallVector<llvm::GlobPattern, 0> warnBackrefsExclude;
+  bool warnMismatchSectionsInComdatGroups;
+  llvm::SmallVector<llvm::GlobPattern, 0>
+      warnMismatchSectionsInComdatGroupsExclude;
   bool warnCommon;
   bool warnMissingEntry;
   bool warnSymbolOrdering;
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 6bef09eeca015aa..0042ad555d43925 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1437,6 +1437,9 @@ static void readConfigs(opt::InputArgList &args) {
       OPT_use_android_relr_tags, OPT_no_use_android_relr_tags, false);
   config->warnBackrefs =
       args.hasFlag(OPT_warn_backrefs, OPT_no_warn_backrefs, false);
+  config->warnMismatchSectionsInComdatGroups =
+      args.hasFlag(OPT_warn_mismatch_sections_in_comdat_groups,
+                   OPT_no_warn_mismatch_sections_in_comdat_groups, false);
   config->warnCommon = args.hasFlag(OPT_warn_common, OPT_no_warn_common, false);
   config->warnSymbolOrdering =
       args.hasFlag(OPT_warn_symbol_ordering, OPT_no_warn_symbol_ordering, true);
@@ -1709,6 +1712,17 @@ static void readConfigs(opt::InputArgList &args) {
             pattern);
   }
 
+  for (opt::Arg *arg :
+       args.filtered(OPT_warn_mismatch_sections_in_comdat_groups_exclude)) {
+    StringRef pattern(arg->getValue());
+    if (Expected<GlobPattern> pat = GlobPattern::create(pattern))
+      config->warnMismatchSectionsInComdatGroupsExclude.push_back(
+          std::move(*pat));
+    else
+      error(arg->getSpelling() + ": " + toString(pat.takeError()) + ": " +
+            pattern);
+  }
+
   // For -no-pie and -pie, --export-dynamic-symbol specifies defined symbols
   // which should be exported. For -shared, references to matched non-local
   // STV_DEFAULT symbols are not bound to definitions within the shared object,
diff --git a/lld/ELF/InputFiles.cpp b/lld/ELF/InputFiles.cpp
index 06a3d565deb765e..0cde39ef1fad655 100644
--- a/lld/ELF/InputFiles.cpp
+++ b/lld/ELF/InputFiles.cpp
@@ -561,6 +561,84 @@ uint32_t ObjFile<ELFT>::getSectionIndex(const Elf_Sym &sym) const {
       this);
 }
 
+template <class ELFT>
+void ObjFile<ELFT>::checkComdatGroups(bool recordedNewSection,
+                                      const InputFile *otherFile,
+                                      ArrayRef<Elf_Word> sectionsInGroup,
+                                      StringRef signature) {
+  ArrayRef<Elf_Shdr> objSections = this->getELFShdrs<ELFT>();
+  object::ELFFile<ELFT> obj = this->getObj();
+  StringRef shstrtab = CHECK(obj.getSectionStringTable(objSections), this);
+  const auto &exclude = config->warnMismatchSectionsInComdatGroupsExclude;
+
+  // First check for a given comdat group, we have all the necessary sections.
+  if (recordedNewSection) {
+    // The signature was recorded - we need to save the section names for this
+    // group.
+    for (uint32_t secIdx : sectionsInGroup) {
+      const Elf_Shdr &section = objSections[secIdx];
+      StringRef sectionName =
+          CHECK(obj.getSectionName(section, shstrtab), this);
+      symtab.comdatGroupSectionNames[CachedHashStringRef(signature)].insert(
+          sectionName);
+    }
+  } else {
+    // The signature was not recorded - this comdat group has a set of section
+    // names associated with it. Check the sets match.
+    const llvm::StringSet<> &cachedSectionNames =
+        symtab.comdatGroupSectionNames.at(CachedHashStringRef(signature));
+    size_t cachedSize = cachedSectionNames.size();
+    size_t foundSize = sectionsInGroup.size();
+    if (cachedSize != foundSize) {
+      warn("comdat group with signature '" + signature + "' in '" +
+           otherFile->getName() + "' has " + Twine(cachedSize) +
+           " section(s) while group in '" + this->getName() + "' has " +
+           Twine(foundSize) + " section(s)");
+      return;
+    }
+
+    // Group sizes match, but may have different section names.
+    for (uint32_t secIdx : sectionsInGroup) {
+      const Elf_Shdr &section = objSections[secIdx];
+      StringRef sectionName =
+          CHECK(obj.getSectionName(section, shstrtab), this);
+      if (!cachedSectionNames.contains(sectionName)) {
+        warn("comdat group with signature '" + signature + "' in '" +
+             otherFile->getName() + "' does not have section '" + sectionName +
+             "' which is part of comdat group in '" + this->getName() + "'");
+        return;
+      }
+    }
+  }
+
+  // Get all sections part of the group seen here.
+  for (uint32_t secIdx : sectionsInGroup) {
+    const Elf_Shdr &section = objSections[secIdx];
+    ArrayRef<uint8_t> contents =
+        CHECK(obj.template getSectionContentsAsArray<uint8_t>(section), this);
+    StringRef sectionName = CHECK(obj.getSectionName(section, shstrtab), this);
+
+    auto pair = symtab.comdatGroupSectionContents.try_emplace(
+        CachedHashStringRef(sectionName), contents);
+    bool added = pair.second;
+    ArrayRef<uint8_t> cachedContents = pair.first->second;
+
+    bool ignoreSection =
+        std::any_of(exclude.begin(), exclude.end(),
+                    [&sectionName](const llvm::GlobPattern &pat) {
+                      return pat.match(sectionName);
+                    });
+
+    if (!added && !ignoreSection && this != otherFile &&
+        !contents.equals(cachedContents)) {
+      warn("section '" + sectionName + "' for comdat group with signature '" +
+           signature + "' has different contents between '" + this->getName() +
+           "' and '" + otherFile->getName() + "'");
+      return;
+    }
+  }
+}
+
 template <class ELFT> void ObjFile<ELFT>::parse(bool ignoreComdats) {
   object::ELFFile<ELFT> obj = this->getObj();
   // Read a section table. justSymbols is usually false.
@@ -646,10 +724,15 @@ template <class ELFT> void ObjFile<ELFT>::parse(bool ignoreComdats) {
     if (flag && flag != GRP_COMDAT)
       fatal(toString(this) + ": unsupported SHT_GROUP format");
 
-    bool keepGroup =
-        (flag & GRP_COMDAT) == 0 || ignoreComdats ||
-        symtab.comdatGroups.try_emplace(CachedHashStringRef(signature), this)
-            .second;
+    bool keepGroup = (flag & GRP_COMDAT) == 0 || ignoreComdats;
+    if (!keepGroup) {
+      auto pair =
+          symtab.comdatGroups.try_emplace(CachedHashStringRef(signature), this);
+      keepGroup = pair.second;
+      if (config->warnMismatchSectionsInComdatGroups)
+        checkComdatGroups(pair.second, pair.first->second, entries.slice(1),
+                          signature);
+    }
     if (keepGroup) {
       if (config->relocatable)
         this->sections[i] = createInputSection(
diff --git a/lld/ELF/InputFiles.h b/lld/ELF/InputFiles.h
index ab98d78fcf1455a..a2ad394d4adf18f 100644
--- a/lld/ELF/InputFiles.h
+++ b/lld/ELF/InputFiles.h
@@ -304,6 +304,10 @@ template <class ELFT> class ObjFile : public ELFFileBase {
 
   bool shouldMerge(const Elf_Shdr &sec, StringRef name);
 
+  void checkComdatGroups(bool recordedNewSection, const InputFile *otherFile,
+                         ArrayRef<Elf_Word> sectionsInGroup,
+                         StringRef signature);
+
   // Each ELF symbol contains a section index which the symbol belongs to.
   // However, because the number of bits dedicated for that is limited, a
   // symbol can directly point to a section only when the section index is
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index c2c9cabc92a4da4..4fd6ac3735e513e 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -510,6 +510,15 @@ defm warn_backrefs_exclude
          "which should be ignored for --warn-backrefs.">,
       MetaVarName<"<glob>">;
 
+defm warn_mismatch_sections_in_comdat_groups: BB<"warn-mismatch-sections-in-comdat-groups",
+    "Warn about discarded sections in a comdat group that have different contents from a retained section in a comdat group with the same key",
+    "Do not warn about discarded sections in a comdat group that have different contents from a retained section in a comdat group with the same key">;
+
+defm warn_mismatch_sections_in_comdat_groups_exclude
+    : EEq<"warn-mismatch-sections-in-comdat-groups-exclude",
+         "Glob describing sections to ignore checking for --warn-mismatch-sections-in-comdat-groups.">,
+      MetaVarName<"<glob>">;
+
 defm warn_common: B<"warn-common",
     "Warn about duplicate common symbols",
     "Do not warn about duplicate common symbols (default)">;
diff --git a/lld/ELF/SymbolTable.h b/lld/ELF/SymbolTable.h
index 37e31d23729637b..1b26bae9c8f81c6 100644
--- a/lld/ELF/SymbolTable.h
+++ b/lld/ELF/SymbolTable.h
@@ -65,6 +65,16 @@ class SymbolTable {
   // is used to uniquify them.
   llvm::DenseMap<llvm::CachedHashStringRef, const InputFile *> comdatGroups;
 
+  // Map from a section name for a comdat section to its contents. These
+  // sections are the ones not discarded.
+  llvm::DenseMap<llvm::CachedHashStringRef, ArrayRef<uint8_t>>
+      comdatGroupSectionContents;
+
+  // Map from a comdat group signature to the names of sections part of the
+  // group.
+  llvm::DenseMap<llvm::CachedHashStringRef, llvm::StringSet<>>
+      comdatGroupSectionNames;
+
   // The Map of __acle_se_<sym>, <sym> pairs found in the input objects.
   // Key is the <sym> name.
   llvm::SmallMapVector<StringRef, ArmCmseEntryFunction, 1> cmseSymMap;
diff --git a/lld/test/ELF/mismatch-sections-in-comdat-group-warning/Inputs/func.s b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/Inputs/func.s
new file mode 100644
index 000000000000000..767d52f0f47a80f
--- /dev/null
+++ b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/Inputs/func.s
@@ -0,0 +1,4 @@
+  .global func
+  .section .data.func,"awG", at progbits,func,comdat
+func:
+  .word 7
diff --git a/lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-group-sizes.s b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-group-sizes.s
new file mode 100644
index 000000000000000..2f39b5ff7e93779
--- /dev/null
+++ b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-group-sizes.s
@@ -0,0 +1,17 @@
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %p/Inputs/func.s -o %t0.o
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t1.o
+// RUN: ld.lld -shared %t0.o %t1.o -o /dev/null --warn-mismatch-sections-in-comdat-groups 2>&1 | FileCheck %s
+// RUN: ld.lld -shared %t1.o %t0.o -o /dev/null --warn-mismatch-sections-in-comdat-groups 2>&1 | FileCheck %s --check-prefix=REVERSE
+
+// CHECK:   warning: comdat group with signature 'func' in '{{.*}}0.o' has 1 section(s) while group in '{{.*}}1.o' has 2 section(s)
+// REVERSE: warning: comdat group with signature 'func' in '{{.*}}1.o' has 2 section(s) while group in '{{.*}}0.o' has 1 section(s)
+
+  .global func
+  .section .data.func,"awG", at progbits,func,comdat
+func:
+  .word 7
+
+  .global func2
+  .section .data.func2,"awG", at progbits,func,comdat
+func2:
+  .word 7
diff --git a/lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-section-contents.s b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-section-contents.s
new file mode 100644
index 000000000000000..2313921021ecc8b
--- /dev/null
+++ b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-section-contents.s
@@ -0,0 +1,12 @@
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %p/Inputs/func.s -o %t0.o
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t1.o
+// RUN: ld.lld -shared %t0.o %t1.o -o /dev/null --warn-mismatch-sections-in-comdat-groups 2>&1 | FileCheck %s
+// RUN: ld.lld -shared %t1.o %t0.o -o /dev/null --warn-mismatch-sections-in-comdat-groups 2>&1 | FileCheck %s --check-prefix=REVERSE
+
+// CHECK:   warning: section '.data.func' for comdat group with signature 'func' has different contents between '{{.*}}1.o' and '{{.*}}0.o'
+// REVERSE: warning: section '.data.func' for comdat group with signature 'func' has different contents between '{{.*}}0.o' and '{{.*}}1.o'
+
+  .global func
+  .section .data.func,"awG", at progbits,func,comdat
+func:
+  .word 6
diff --git a/lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-section-names.s b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-section-names.s
new file mode 100644
index 000000000000000..d21c33354dc9a3d
--- /dev/null
+++ b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/different-section-names.s
@@ -0,0 +1,12 @@
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %p/Inputs/func.s -o %t0.o
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t1.o
+// RUN: ld.lld -shared %t0.o %t1.o -o /dev/null --warn-mismatch-sections-in-comdat-groups 2>&1 | FileCheck %s
+// RUN: ld.lld -shared %t1.o %t0.o -o /dev/null --warn-mismatch-sections-in-comdat-groups 2>&1 | FileCheck %s --check-prefix=REVERSE
+
+// CHECK:   warning: comdat group with signature 'func' in '{{.*}}0.o' does not have section '.data.func2' which is part of comdat group in '{{.*}}1.o'
+// REVERSE: warning: comdat group with signature 'func' in '{{.*}}1.o' does not have section '.data.func' which is part of comdat group in '{{.*}}0.o'
+
+  .global func
+  .section .data.func2,"awG", at progbits,func,comdat
+func:
+  .word 7
diff --git a/lld/test/ELF/mismatch-sections-in-comdat-group-warning/disabled-warning.s b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/disabled-warning.s
new file mode 100644
index 000000000000000..09ebdc798ef694b
--- /dev/null
+++ b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/disabled-warning.s
@@ -0,0 +1,13 @@
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %p/Inputs/func.s -o %t0.o
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t1.o
+// RUN: ld.lld -shared %t0.o %t1.o -o /dev/null --fatal-warnings
+// RUN: ld.lld -shared %t1.o %t0.o -o /dev/null --fatal-warnings
+// RUN: ld.lld -shared %t0.o %t1.o -o /dev/null --fatal-warnings --no-warn-mismatch-sections-in-comdat-groups
+// RUN: ld.lld -shared %t1.o %t0.o -o /dev/null --fatal-warnings --no-warn-mismatch-sections-in-comdat-groups
+
+// No error since the warning is disabled, despite the mismatch.
+
+  .global func
+  .section .data.func,"awG", at progbits,func,comdat
+func:
+  .word 6
diff --git a/lld/test/ELF/mismatch-sections-in-comdat-group-warning/ignore-section.s b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/ignore-section.s
new file mode 100644
index 000000000000000..a5a59f447799dd5
--- /dev/null
+++ b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/ignore-section.s
@@ -0,0 +1,9 @@
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %p/Inputs/func.s -o %t0.o
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t1.o
+// RUN: ld.lld -shared %t0.o %t1.o -o /dev/null --fatal-warnings --warn-mismatch-sections-in-comdat-groups --warn-mismatch-sections-in-comdat-groups-exclude=.data.f*
+// RUN: ld.lld -shared %t1.o %t0.o -o /dev/null --fatal-warnings --warn-mismatch-sections-in-comdat-groups --warn-mismatch-sections-in-comdat-groups-exclude=.data.f*
+
+  .global func
+  .section .data.func,"awG", at progbits,func,comdat
+func:
+  .word 6
diff --git a/lld/test/ELF/mismatch-sections-in-comdat-group-warning/no-warning.s b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/no-warning.s
new file mode 100644
index 000000000000000..f80ea070e358ae9
--- /dev/null
+++ b/lld/test/ELF/mismatch-sections-in-comdat-group-warning/no-warning.s
@@ -0,0 +1,11 @@
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %p/Inputs/func.s -o %t0.o
+// RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t1.o
+// RUN: ld.lld -shared %t0.o %t1.o -o /dev/null --fatal-warnings --warn-mismatch-sections-in-comdat-groups
+// RUN: ld.lld -shared %t1.o %t0.o -o /dev/null --fatal-warnings --warn-mismatch-sections-in-comdat-groups
+
+// No error since the sections in the comdat groups for `func` in both object files are the same.
+
+  .global func
+  .section .data.func,"awG", at progbits,func,comdat
+func:
+  .word 7

``````````

</details>


https://github.com/llvm/llvm-project/pull/74151


More information about the llvm-commits mailing list