[lld] r329636 - Add --warn-backrefs to maintain compatibility with other linkers

Rui Ueyama via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 9 16:05:48 PDT 2018


Author: ruiu
Date: Mon Apr  9 16:05:48 2018
New Revision: 329636

URL: http://llvm.org/viewvc/llvm-project?rev=329636&view=rev
Log:
Add --warn-backrefs to maintain compatibility with other linkers

I'm proposing a new command line flag, --warn-backrefs in this patch.
The flag and the feature proposed below don't exist in GNU linkers
nor the current lld.

--warn-backrefs is an option to detect reverse or cyclic dependencies
between static archives, and it can be used to keep your program
compatible with GNU linkers after you switch to lld. I'll explain the
feature and why you may find it useful below.

lld's symbol resolution semantics is more relaxed than traditional
Unix linkers. Therefore,

  ld.lld foo.a bar.o

succeeds even if bar.o contains an undefined symbol that have to be
resolved by some object file in foo.a. Traditional Unix linkers
don't allow this kind of backward reference, as they visit each
file only once from left to right in the command line while
resolving all undefined symbol at the moment of visiting.

In the above case, since there's no undefined symbol when a linker
visits foo.a, no files are pulled out from foo.a, and because the
linker forgets about foo.a after visiting, it can't resolve
undefined symbols that could have been resolved otherwise.

That lld accepts more relaxed form means (besides it makes more
sense) that you can accidentally write a command line or a build
file that works only with lld, even if you have a plan to
distribute it to wider users who may be using GNU linkers.  With
--check-library-dependency, you can detect a library order that
doesn't work with other Unix linkers.

The option is also useful to detect cyclic dependencies between
static archives. Again, lld accepts

  ld.lld foo.a bar.a

even if foo.a and bar.a depend on each other. With --warn-backrefs
it is handled as an error.

Here is how the option works. We assign a group ID to each file. A
file with a smaller group ID can pull out object files from an
archive file with an equal or greater group ID. Otherwise, it is a
reverse dependency and an error.

A file outside --{start,end}-group gets a fresh ID when
instantiated. All files within the same --{start,end}-group get the
same group ID. E.g.

  ld.lld A B --start-group C D --end-group E

A and B form group 0, C, D and their member object files form group
1, and E forms group 2. I think that you can see how this group
assignment rule simulates the traditional linker's semantics.

Differential Revision: https://reviews.llvm.org/D45195

Added:
    lld/trunk/test/ELF/warn-backrefs.s
Modified:
    lld/trunk/ELF/Config.h
    lld/trunk/ELF/Driver.cpp
    lld/trunk/ELF/InputFiles.cpp
    lld/trunk/ELF/InputFiles.h
    lld/trunk/ELF/Options.td
    lld/trunk/ELF/ScriptParser.cpp
    lld/trunk/ELF/SymbolTable.cpp

Modified: lld/trunk/ELF/Config.h
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Config.h?rev=329636&r1=329635&r2=329636&view=diff
==============================================================================
--- lld/trunk/ELF/Config.h (original)
+++ lld/trunk/ELF/Config.h Mon Apr  9 16:05:48 2018
@@ -153,6 +153,7 @@ struct Configuration {
   bool Target1Rel;
   bool Trace;
   bool UndefinedVersion;
+  bool WarnBackrefs;
   bool WarnCommon;
   bool WarnMissingEntry;
   bool WarnSymbolOrdering;

Modified: lld/trunk/ELF/Driver.cpp
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Driver.cpp?rev=329636&r1=329635&r2=329636&view=diff
==============================================================================
--- lld/trunk/ELF/Driver.cpp (original)
+++ lld/trunk/ELF/Driver.cpp Mon Apr  9 16:05:48 2018
@@ -704,6 +704,8 @@ void LinkerDriver::readConfigs(opt::Inpu
   Config->UndefinedVersion =
       Args.hasFlag(OPT_undefined_version, OPT_no_undefined_version, true);
   Config->UnresolvedSymbols = getUnresolvedSymbolPolicy(Args);
+  Config->WarnBackrefs =
+      Args.hasFlag(OPT_warn_backrefs, OPT_no_warn_backrefs, false);
   Config->WarnCommon = Args.hasFlag(OPT_warn_common, OPT_no_warn_common, false);
   Config->WarnSymbolOrdering =
       Args.hasFlag(OPT_warn_symbol_ordering, OPT_no_warn_symbol_ordering, true);
@@ -937,6 +939,16 @@ void LinkerDriver::createFiles(opt::Inpu
         Files.back()->JustSymbols = true;
       }
       break;
+    case OPT_start_group:
+      if (InputFile::IsInGroup)
+        error("nested --start-group");
+      InputFile::IsInGroup = true;
+      break;
+    case OPT_end_group:
+      if (!InputFile::IsInGroup)
+        error("stray --end-group");
+      InputFile::IsInGroup = false;
+      break;
     case OPT_start_lib:
       InLib = true;
       break;

Modified: lld/trunk/ELF/InputFiles.cpp
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/InputFiles.cpp?rev=329636&r1=329635&r2=329636&view=diff
==============================================================================
--- lld/trunk/ELF/InputFiles.cpp (original)
+++ lld/trunk/ELF/InputFiles.cpp Mon Apr  9 16:05:48 2018
@@ -38,6 +38,8 @@ using namespace llvm::sys::fs;
 using namespace lld;
 using namespace lld::elf;
 
+bool InputFile::IsInGroup;
+uint32_t InputFile::NextGroupId;
 std::vector<BinaryFile *> elf::BinaryFiles;
 std::vector<BitcodeFile *> elf::BitcodeFiles;
 std::vector<InputFile *> elf::ObjectFiles;
@@ -45,7 +47,13 @@ std::vector<InputFile *> elf::SharedFile
 
 TarWriter *elf::Tar;
 
-InputFile::InputFile(Kind K, MemoryBufferRef M) : MB(M), FileKind(K) {}
+InputFile::InputFile(Kind K, MemoryBufferRef M)
+    : MB(M), GroupId(NextGroupId), FileKind(K) {
+  // All files within the same --{start,end}-group get the same group ID.
+  // Otherwise, a new file will get a new group ID.
+  if (!IsInGroup)
+    ++NextGroupId;
+}
 
 Optional<MemoryBufferRef> elf::readFile(StringRef Path) {
   // The --chroot option changes our virtual root directory.
@@ -760,8 +768,10 @@ InputFile *ArchiveFile::fetch(const Arch
   if (Tar && C.getParent()->isThin())
     Tar->append(relativeToRoot(CHECK(C.getFullName(), this)), MB.getBuffer());
 
-  return createObjectFile(MB, getName(),
-                          C.getParent()->isThin() ? 0 : C.getChildOffset());
+  InputFile *File = createObjectFile(
+      MB, getName(), C.getParent()->isThin() ? 0 : C.getChildOffset());
+  File->GroupId = GroupId;
+  return File;
 }
 
 template <class ELFT>
@@ -1168,7 +1178,10 @@ InputFile *LazyObjFile::fetch() {
   MemoryBufferRef MBRef = getBuffer();
   if (MBRef.getBuffer().empty())
     return nullptr;
-  return createObjectFile(MBRef, ArchiveName, OffsetInArchive);
+
+  InputFile *File = createObjectFile(MBRef, ArchiveName, OffsetInArchive);
+  File->GroupId = GroupId;
+  return File;
 }
 
 template <class ELFT> void LazyObjFile::parse() {

Modified: lld/trunk/ELF/InputFiles.h
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/InputFiles.h?rev=329636&r1=329635&r2=329636&view=diff
==============================================================================
--- lld/trunk/ELF/InputFiles.h (original)
+++ lld/trunk/ELF/InputFiles.h Mon Apr  9 16:05:48 2018
@@ -112,6 +112,13 @@ public:
   // True if this is an argument for --just-symbols. Usually false.
   bool JustSymbols = false;
 
+  // GroupId is used for --warn-backrefs which is an optional error
+  // checking feature. All files within the same --{start,end}-group
+  // get the same group ID. Otherwise, each file gets a new group
+  // ID. For more info, see checkDependency() in SymbolTable.cpp.
+  uint32_t GroupId;
+  static bool IsInGroup;
+
 protected:
   InputFile(Kind K, MemoryBufferRef M);
   std::vector<InputSectionBase *> Sections;
@@ -119,6 +126,8 @@ protected:
 
 private:
   const Kind FileKind;
+
+  static uint32_t NextGroupId;
 };
 
 template <typename ELFT> class ELFFileBase : public InputFile {

Modified: lld/trunk/ELF/Options.td
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Options.td?rev=329636&r1=329635&r2=329636&view=diff
==============================================================================
--- lld/trunk/ELF/Options.td (original)
+++ lld/trunk/ELF/Options.td Mon Apr  9 16:05:48 2018
@@ -113,6 +113,9 @@ def emit_relocs: F<"emit-relocs">, HelpT
 def enable_new_dtags: F<"enable-new-dtags">,
   HelpText<"Enable new dynamic tags">;
 
+def end_group: F<"end-group">,
+  HelpText<"Ignored for compatibility with GNU unless you pass --warn-backrefs">;
+
 def end_lib: F<"end-lib">,
   HelpText<"End a grouping of objects that should be treated as if they were together in an archive">;
 
@@ -274,6 +277,9 @@ defm soname: Eq<"soname">, HelpText<"Set
 defm sort_section: Eq<"sort-section">,
   HelpText<"Specifies sections sorting rule when linkerscript is used">;
 
+def start_group: F<"start-group">,
+  HelpText<"Ignored for compatibility with GNU unless you pass --warn-backrefs">;
+
 def start_lib: F<"start-lib">,
   HelpText<"Start a grouping of objects that should be treated as if they were together in an archive">;
 
@@ -323,6 +329,10 @@ def version: F<"version">, HelpText<"Dis
 
 defm version_script: Eq<"version-script">, HelpText<"Read a version script">;
 
+defm warn_backrefs: B<"warn-backrefs",
+    "Warn about backward symbol references to fetch archive members",
+    "Do not warn about backward symbol references to fetch archive members">;
+
 defm warn_common: B<"warn-common",
     "Warn about duplicate common symbols",
     "Do not warn about duplicate common symbols">;
@@ -357,6 +367,7 @@ def alias_define_common_dp: F<"dp">, Ali
 def alias_discard_all_x: Flag<["-"], "x">, Alias<discard_all>;
 def alias_discard_locals_X: Flag<["-"], "X">, Alias<discard_locals>;
 def alias_emit_relocs: Flag<["-"], "q">, Alias<emit_relocs>;
+def alias_end_group_paren: Flag<["-"], ")">, Alias<end_group>;
 def alias_entry_e: JoinedOrSeparate<["-"], "e">, Alias<entry>;
 def alias_export_dynamic_E: Flag<["-"], "E">, Alias<export_dynamic>;
 def alias_filter: Separate<["-"], "F">, Alias<filter>;
@@ -374,6 +385,7 @@ def alias_rpath_R: JoinedOrSeparate<["-"
 def alias_script_T: JoinedOrSeparate<["-"], "T">, Alias<script>;
 def alias_shared_Bshareable: F<"Bshareable">, Alias<shared>;
 def alias_soname_h: JoinedOrSeparate<["-"], "h">, Alias<soname>;
+def alias_start_group_paren: Flag<["-"], "(">, Alias<start_group>;
 def alias_strip_all: Flag<["-"], "s">, Alias<strip_all>;
 def alias_strip_debug_S: Flag<["-"], "S">, Alias<strip_debug>;
 def alias_trace: Flag<["-"], "t">, Alias<trace>;
@@ -383,14 +395,6 @@ def alias_Ttext_segment_eq: Joined<["-",
 def alias_undefined_u: JoinedOrSeparate<["-"], "u">, Alias<undefined>;
 def alias_version_V: Flag<["-"], "V">, Alias<version>;
 
-// Our symbol resolution algorithm handles symbols in archive files differently
-// than traditional linkers, so we don't need --start-group and --end-group.
-// These options are recongized for compatibility but ignored.
-def end_group: F<"end-group">;
-def end_group_paren: Flag<["-"], ")">;
-def start_group: F<"start-group">;
-def start_group_paren: Flag<["-"], "(">;
-
 // LTO-related options.
 def lto_aa_pipeline: J<"lto-aa-pipeline=">,
   HelpText<"AA pipeline to run during LTO. Used in conjunction with -lto-newpm-passes">;

Modified: lld/trunk/ELF/ScriptParser.cpp
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/ScriptParser.cpp?rev=329636&r1=329635&r2=329636&view=diff
==============================================================================
--- lld/trunk/ELF/ScriptParser.cpp (original)
+++ lld/trunk/ELF/ScriptParser.cpp Mon Apr  9 16:05:48 2018
@@ -63,6 +63,7 @@ private:
   void readExtern();
   void readGroup();
   void readInclude();
+  void readInput();
   void readMemory();
   void readOutput();
   void readOutputArch();
@@ -233,10 +234,12 @@ void ScriptParser::readLinkerScript() {
       readEntry();
     } else if (Tok == "EXTERN") {
       readExtern();
-    } else if (Tok == "GROUP" || Tok == "INPUT") {
+    } else if (Tok == "GROUP") {
       readGroup();
     } else if (Tok == "INCLUDE") {
       readInclude();
+    } else if (Tok == "INPUT") {
+      readInput();
     } else if (Tok == "MEMORY") {
       readMemory();
     } else if (Tok == "OUTPUT") {
@@ -326,13 +329,10 @@ void ScriptParser::readExtern() {
 }
 
 void ScriptParser::readGroup() {
-  expect("(");
-  while (!errorCount() && !consume(")")) {
-    if (consume("AS_NEEDED"))
-      readAsNeeded();
-    else
-      addFile(unquote(next()));
-  }
+  bool Orig = InputFile::IsInGroup;
+  InputFile::IsInGroup = true;
+  readInput();
+  InputFile::IsInGroup = Orig;
 }
 
 void ScriptParser::readInclude() {
@@ -351,6 +351,16 @@ void ScriptParser::readInclude() {
   setError("cannot find linker script " + Tok);
 }
 
+void ScriptParser::readInput() {
+  expect("(");
+  while (!errorCount() && !consume(")")) {
+    if (consume("AS_NEEDED"))
+      readAsNeeded();
+    else
+      addFile(unquote(next()));
+  }
+}
+
 void ScriptParser::readOutput() {
   // -o <file> takes predecence over OUTPUT(<file>).
   expect("(");

Modified: lld/trunk/ELF/SymbolTable.cpp
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/SymbolTable.cpp?rev=329636&r1=329635&r2=329636&view=diff
==============================================================================
--- lld/trunk/ELF/SymbolTable.cpp (original)
+++ lld/trunk/ELF/SymbolTable.cpp Mon Apr  9 16:05:48 2018
@@ -287,6 +287,63 @@ template <class ELFT> Symbol *SymbolTabl
 
 static uint8_t getVisibility(uint8_t StOther) { return StOther & 3; }
 
+// Do extra check for --warn-backrefs.
+//
+// --warn-backrefs is an option to prevent an undefined reference from
+// fetching an archive member written earlier in the command line. It can be
+// used to keep your program compatible with GNU linkers after you switch to
+// lld. I'll explain the feature and why you may find it useful in this
+// comment.
+//
+// lld's symbol resolution semantics is more relaxed than traditional Unix
+// linkers. For example,
+//
+//   ld.lld foo.a bar.o
+//
+// succeeds even if bar.o contains an undefined symbol that have to be
+// resolved by some object file in foo.a. Traditional Unix linkers don't
+// allow this kind of backward reference, as they visit each file only once
+// from left to right in the command line while resolving all undefined
+// symbols at the moment of visiting.
+//
+// In the above case, since there's no undefined symbol when a linker visits
+// foo.a, no files are pulled out from foo.a, and because the linker forgets
+// about foo.a after visiting, it can't resolve undefined symbols in bar.o
+// that could have been resolved otherwise.
+//
+// That lld accepts more relaxed form means that (besides it'd make more
+// sense) you can accidentally write a command line or a build file that
+// works only with lld, even if you have a plan to distribute it to wider
+// users who may be using GNU linkers. With --warn-backrefs, you can detect
+// a library order that doesn't work with other Unix linkers.
+//
+// The option is also useful to detect cyclic dependencies between static
+// archives. Again, lld accepts
+//
+//   ld.lld foo.a bar.a
+//
+// even if foo.a and bar.a depend on each other. With --warn-backrefs, it is
+// handled as an error.
+//
+// Here is how the option works. We assign a group ID to each file. A file
+// with a smaller group ID can pull out object files from an archive file
+// with an equal or greater group ID. Otherwise, it is a reverse dependency
+// and an error.
+//
+// A file outside --{start,end}-group gets a fresh ID when instantiated. All
+// files within the same --{start,end}-group get the same group ID. E.g.
+//
+//   ld.lld A B --start-group C D --end-group E
+//
+// A forms group 0. B form group 1. C and D (including their member object
+// files) form group 2. E forms group 3. I think that you can see how this
+// group assignment rule simulates the traditional linker's semantics.
+static void checkBackrefs(StringRef Name, InputFile *Old, InputFile *New) {
+  if (Config->WarnBackrefs && Old && New->GroupId < Old->GroupId)
+    warn("backward reference detected: " + Name + " in " + toString(Old) +
+         " refers to " + toString(New));
+}
+
 template <class ELFT>
 Symbol *SymbolTable::addUndefined(StringRef Name, uint8_t Binding,
                                   uint8_t StOther, uint8_t Type,
@@ -314,10 +371,13 @@ Symbol *SymbolTable::addUndefined(String
   if (S->isLazy()) {
     // An undefined weak will not fetch archive members. See comment on Lazy in
     // Symbols.h for the details.
-    if (Binding == STB_WEAK)
+    if (Binding == STB_WEAK) {
       S->Type = Type;
-    else
-      fetchLazy<ELFT>(S);
+      return S;
+    }
+
+    checkBackrefs(Name, File, S->File);
+    fetchLazy<ELFT>(S);
   }
   return S;
 }

Added: lld/trunk/test/ELF/warn-backrefs.s
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/warn-backrefs.s?rev=329636&view=auto
==============================================================================
--- lld/trunk/test/ELF/warn-backrefs.s (added)
+++ lld/trunk/test/ELF/warn-backrefs.s Mon Apr  9 16:05:48 2018
@@ -0,0 +1,32 @@
+# REQUIRES: x86
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t1.o
+# RUN: echo ".globl foo; foo:" | llvm-mc -filetype=obj -triple=x86_64-unknown-linux - -o %t2.o
+# RUN: rm -f %t2.a
+# RUN: llvm-ar rcs %t2.a %t2.o
+
+# RUN: ld.lld --fatal-warnings -o %t.exe %t1.o %t2.a
+# RUN: ld.lld --fatal-warnings -o %t.exe %t2.a %t1.o
+# RUN: ld.lld --fatal-warnings --warn-backrefs -o %t.exe %t1.o %t2.a
+# RUN: ld.lld --fatal-warnings --warn-backrefs -o %t.exe %t1.o --start-lib %t2.o --end-lib
+
+# RUN: ld.lld --fatal-warnings --warn-backrefs -o %t.exe --start-group %t2.a %t1.o --end-group
+# RUN: ld.lld --fatal-warnings --warn-backrefs -o %t.exe "-(" %t2.a %t1.o "-)"
+
+# RUN: echo "INPUT(\"%t1.o\" \"%t2.a\")" > %t1.script
+# RUN: ld.lld --fatal-warnings --warn-backrefs -o %t.exe %t1.script
+
+# RUN: echo "GROUP(\"%t2.a\" \"%t1.o\")" > %t2.script
+# RUN: ld.lld --fatal-warnings --warn-backrefs -o %t.exe %t2.script
+
+# RUN: not ld.lld --fatal-warnings --warn-backrefs -o %t.exe %t2.a %t1.o 2>&1 | FileCheck %s
+# RUN: not ld.lld --fatal-warnings --warn-backrefs -o %t.exe %t2.a "-(" %t1.o "-)" 2>&1 | FileCheck %s
+
+# CHECK: backward reference detected: foo in {{.*}}1.o refers to {{.*}}2.a
+
+# RUN: not ld.lld --fatal-warnings --end-group 2>&1 | FileCheck -check-prefix=ENDGROUP %s
+# ENDGROUP: stray --end-group
+
+.globl _start, foo
+_start:
+  call foo




More information about the llvm-commits mailing list