[PATCH] llvm-link: Add -override flag to prefer duplicate symbols from one module.

Duncan P. N. Exon Smith dexonsmith at apple.com
Tue Apr 21 15:42:42 PDT 2015


> On 2015-Apr-20, at 15:07, Luqman Aden <me+llvm at luqman.ca> wrote:
> 
> 
> 
> On Mon, Apr 20, 2015 at 1:59 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> > On 2015-Apr-20, at 13:28, Luqman Aden <me+llvm at luqman.ca> wrote:
> >
> > From df1b11c561dc826d8e4836b9e910ac75185bcf05 Mon Sep 17 00:00:00 2001
> > From: Luqman Aden <luqman at apple.com>
> > Date: Wed, 15 Apr 2015 12:22:32 -0700
> > Subject: [PATCH] llvm-link: Add -override flag to prefer one of a duplicate
> >  symbol.
> >
> > ---
> >  include/llvm/Linker/Linker.h       |  4 +++-
> >  lib/Linker/LinkModules.cpp         | 20 +++++++++++++++----
> >  test/tools/llvm-link/Inputs/foo.ll |  4 ++++
> >  test/tools/llvm-link/hello.ll      | 20 +++++++++++++++++++
> >  test/tools/llvm-link/lit.local.cfg |  1 +
> >  tools/llvm-link/llvm-link.cpp      | 40 ++++++++++++++++++++++++--------------
> >  6 files changed, 69 insertions(+), 20 deletions(-)
> >  create mode 100644 test/tools/llvm-link/Inputs/foo.ll
> >  create mode 100644 test/tools/llvm-link/hello.ll
> >  create mode 100644 test/tools/llvm-link/lit.local.cfg
> >
> 
> > From df1b11c561dc826d8e4836b9e910ac75185bcf05 Mon Sep 17 00:00:00 2001
> > From: Luqman Aden <luqman at apple.com>
> > Date: Wed, 15 Apr 2015 12:22:32 -0700
> > Subject: [PATCH] llvm-link: Add -override flag to prefer one of a duplicate
> >  symbol.
> 
> Luqman and I talked in person, and he added some context for this (seems
> to still be missing from llvm-commits?).  The idea is to support an LLVM
> developer workflow where a subset of the module is extracted to a side
> module to be manipulated or optimized differently.  Sometime before
> CodeGen, these functions are merged back in, and should replace the
> copies in the "main" module (regardless of IR linkage types).
> 
> This seems like a pretty useful debugging workflow to me, and the
> approach in this patch seems about right.
> 
> @Rafael, I'd appreciate your opinion on this.
> 
> >
> > ---
> >  include/llvm/Linker/Linker.h       |  4 +++-
> >  lib/Linker/LinkModules.cpp         | 20 +++++++++++++++----
> >  test/tools/llvm-link/Inputs/foo.ll |  4 ++++
> >  test/tools/llvm-link/hello.ll      | 20 +++++++++++++++++++
> >  test/tools/llvm-link/lit.local.cfg |  1 +
> >  tools/llvm-link/llvm-link.cpp      | 40 ++++++++++++++++++++++++--------------
> >  6 files changed, 69 insertions(+), 20 deletions(-)
> >  create mode 100644 test/tools/llvm-link/Inputs/foo.ll
> >  create mode 100644 test/tools/llvm-link/hello.ll
> >  create mode 100644 test/tools/llvm-link/lit.local.cfg
> >
> > diff --git a/include/llvm/Linker/Linker.h b/include/llvm/Linker/Linker.h
> > index 5ca815c..c43b90e 100644
> > --- a/include/llvm/Linker/Linker.h
> > +++ b/include/llvm/Linker/Linker.h
> > @@ -68,8 +68,10 @@ public:
> >    void deleteModule();
> >
> >    /// \brief Link \p Src into the composite. The source is destroyed.
> > +  /// Passing OverrideSymbols as true will have symbols from Src
> > +  /// shadow those in the Dest.
> >    /// Returns true on error.
> > -  bool linkInModule(Module *Src);
> > +  bool linkInModule(Module *Src, bool OverrideSymbols = false);
> >
> >    /// \brief Set the composite to the passed-in module.
> >    void setModule(Module *Dst);
> > diff --git a/lib/Linker/LinkModules.cpp b/lib/Linker/LinkModules.cpp
> > index 21edc50..3a46a33 100644
> > --- a/lib/Linker/LinkModules.cpp
> > +++ b/lib/Linker/LinkModules.cpp
> > @@ -424,12 +424,17 @@ class ModuleLinker {
> >
> >    DiagnosticHandlerFunction DiagnosticHandler;
> >
> > +  /// For symbol clashes, prefer those from Src.
> > +  bool OverrideFromSrc;
> > +
> >  public:
> >    ModuleLinker(Module *dstM, Linker::IdentifiedStructTypeSet &Set, Module *srcM,
> > -               DiagnosticHandlerFunction DiagnosticHandler)
> > +               DiagnosticHandlerFunction DiagnosticHandler,
> > +               bool Override = false)
> 
> Why spell this differently from the class member?
> 
> Also, are there any callers that don't specify this flag?  If not, don't
> provide a default.
> 
> Updated.
>  
> 
> >        : DstM(dstM), SrcM(srcM), TypeMap(Set),
> >          ValMaterializer(TypeMap, DstM, LazilyLinkGlobalValues),
> > -        DiagnosticHandler(DiagnosticHandler) {}
> > +        DiagnosticHandler(DiagnosticHandler),
> > +        OverrideFromSrc(Override) {}
> >
> >    bool run();
> >
> > @@ -725,6 +730,12 @@ bool ModuleLinker::getComdatResult(const Comdat *SrcC,
> >  bool ModuleLinker::shouldLinkFromSource(bool &LinkFromSrc,
> >                                          const GlobalValue &Dest,
> >                                          const GlobalValue &Src) {
> > +  // Should we unconditionally use the Src?
> > +  if (OverrideFromSrc) {
> > +    LinkFromSrc = true;
> > +    return false;
> > +  }
> > +
> >    // We always have to add Src if it has appending linkage.
> >    if (Src.hasAppendingLinkage()) {
> >      LinkFromSrc = true;
> > @@ -794,6 +805,7 @@ bool ModuleLinker::shouldLinkFromSource(bool &LinkFromSrc,
> >    assert(!Dest.hasExternalWeakLinkage());
> >    assert(Dest.hasExternalLinkage() && Src.hasExternalLinkage() &&
> >           "Unexpected linkage type!");
> > +
> 
> This seems unrelated to the patch?
> 
> Yes, whitespace left over from debugging before. I've now removed it.
>  
> 
> >    return emitError("Linking globals named '" + Src.getName() +
> >                     "': symbol multiply defined!");
> >  }
> > @@ -1742,9 +1754,9 @@ void Linker::deleteModule() {
> >    Composite = nullptr;
> >  }
> >
> > -bool Linker::linkInModule(Module *Src) {
> > +bool Linker::linkInModule(Module *Src, bool OverrideSymbols) {
> >    ModuleLinker TheLinker(Composite, IdentifiedStructTypes, Src,
> > -                         DiagnosticHandler);
> > +                         DiagnosticHandler, OverrideSymbols);
> >    bool RetCode = TheLinker.run();
> >    Composite->dropTriviallyDeadConstantArrays();
> >    return RetCode;
> > diff --git a/test/tools/llvm-link/Inputs/foo.ll b/test/tools/llvm-link/Inputs/foo.ll
> > new file mode 100644
> > index 0000000..6e06fa5
> > --- /dev/null
> > +++ b/test/tools/llvm-link/Inputs/foo.ll
> > @@ -0,0 +1,4 @@
> > +define i32 @foo(i32 %i) {
> > +entry:
> > +  ret i32 4
> > +}
> 
> We usually put lib/Linker tests in "test/Linker".
> 
> > diff --git a/test/tools/llvm-link/hello.ll b/test/tools/llvm-link/hello.ll
> > new file mode 100644
> > index 0000000..c124864
> > --- /dev/null
> > +++ b/test/tools/llvm-link/hello.ll
> > @@ -0,0 +1,20 @@
> > +; RUN: llvm-as %s -o %t-hello.bc
> > +; RUN: llvm-as %p/Inputs/foo.ll -o %t-foo.bc
> 
> Why not just use llvm-link directly?
> 
> I initially thought it wanted .bc only and just hadn't updated the test there.
>  
> 
> Also, I recommend this naming scheme:
> 
>     test/Linker/manual-override.ll
>     test/Linker/Inputs/manual-override.ll
> 
> (Or whatever you want to name the tests, but I wouldn't call it "hello".)
> 
> I'd also use `%S` instead of `%p`, since last I checked only the
> former is actually documented at llvm.org.
> 
> Also, it might be worth testing both command-line orderings.
> 
> Added.
>  
> 
> > +; RUN: llvm-link %t-hello.bc -override %t-foo.bc -S | FileCheck %s
> > +
> > +
> > +; CHECK-LABEL: define i32 @foo
> > +; CHECK-NEXT: entry:
> > +; CHECK-NEXT: ret i32 4
> > +define i32 @foo(i32 %i) {
> > +entry:
> > +  %add = add nsw i32 %i, %i
> > +  ret i32 %add
> > +}
> > +
> > +; Function Attrs: nounwind ssp uwtable
> > +define i32 @main(i32 %argc, i8** %argv) {
> > +entry:
> > +  %a = call i32 @foo(i32 2)
> > +  ret i32 %a
> > +}
> > diff --git a/test/tools/llvm-link/lit.local.cfg b/test/tools/llvm-link/lit.local.cfg
> > new file mode 100644
> > index 0000000..c6106e4
> > --- /dev/null
> > +++ b/test/tools/llvm-link/lit.local.cfg
> > @@ -0,0 +1 @@
> > +config.suffixes = ['.ll']
> 
> Since the tests should probably be moved to `test/Linker`, we probably
> don't need this.
> 
> > diff --git a/tools/llvm-link/llvm-link.cpp b/tools/llvm-link/llvm-link.cpp
> > index 6924aa5..54d49f5 100644
> > --- a/tools/llvm-link/llvm-link.cpp
> > +++ b/tools/llvm-link/llvm-link.cpp
> > @@ -38,6 +38,11 @@ static cl::list<std::string>
> >  InputFilenames(cl::Positional, cl::OneOrMore,
> >                 cl::desc("<input bitcode files>"));
> >
> > +static cl::list<std::string>
> > +OverridingInputs("override", cl::ZeroOrMore, cl::value_desc("filename"),
> > +                 cl::desc("input bitcode file which can "
> > +                          "override previously defined symbol"));
> > +
> >  static cl::opt<std::string>
> >  OutputFilename("o", cl::desc("Override output filename"), cl::init("-"),
> >                 cl::value_desc("filename"));
> > @@ -109,24 +114,29 @@ int main(int argc, char **argv) {
> >    auto Composite = make_unique<Module>("llvm-link", Context);
> >    Linker L(Composite.get(), diagnosticHandler);
> >
> > -  for (unsigned i = 0; i < InputFilenames.size(); ++i) {
> > -    std::unique_ptr<Module> M = loadFile(argv[0], InputFilenames[i], Context);
> > -    if (!M.get()) {
> > -      errs() << argv[0] << ": error loading file '" <<InputFilenames[i]<< "'\n";
> > -      return 1;
> > -    }
> > +  auto linkFiles = [&](const cl::list<std::string>& Files, bool Override) {
> > +    for (auto File : Files) {
> > +      std::unique_ptr<Module> M = loadFile(argv[0], File, Context);
> > +      if (!M.get()) {
> > +        errs() << argv[0] << ": error loading file '" << File << "'\n";
> > +        exit(1);
> > +      }
> >
> > -    if (verifyModule(*M, &errs())) {
> > -      errs() << argv[0] << ": " << InputFilenames[i]
> > -             << ": error: input module is broken!\n";
> > -      return 1;
> > -    }
> > +      if (verifyModule(*M, &errs())) {
> > +        errs() << argv[0] << ": " << File
> > +               << ": error: input module is broken!\n";
> > +        exit(1);
> > +      }
> >
> > -    if (Verbose) errs() << "Linking in '" << InputFilenames[i] << "'\n";
> > +      if (Verbose) errs() << "Linking in '" << File << "'\n";
> >
> > -    if (L.linkInModule(M.get()))
> > -      return 1;
> > -  }
> > +      if (L.linkInModule(M.get(), Override))
> > +        exit(1);
> > +    }
> > +  };
> > +
> > +  linkFiles(InputFilenames, false);
> 
> It'd be nice for the refactoring part of this (i.e., the part that
> changes all the indentation) to be in a separate NFC (no functionality
> change) commit so it's easier to see what has really changed.
> 
> Done.

Ah, but the order seems wrong.

In this case, anyway, putting the NFC commit first gives you two
distinct changes: (1) refactor to make the code reusable, (2) actually
reuse the code, adding a flag and condition, etc.  Each commit stands
on its own and is easy to post-commit review (first commit, refactoring
noise; second commit, small functional change).

The order you chose (NFC second) adds the duplicate/different code and
then cleans it up.  The first patch -- the one with real changes -- is
*bigger* than your original patch, whereas my goal was to minimize the
noise around the functional changes.  Please reverse the commits.

> +  auto linkFiles = [&](const cl::list<std::string>& Files, bool Override) {

Since I've gotten all nit-picky here anyway... I wonder whether using a
lambda function is actually useful here.  It saves passing `argv[0]` and
`Context`, but I'm not sure that's buying much over just creating a
normal `static` function.  The latter seems just as clear, or maybe even
clearer.  I don't have a strong opinion; it's up to you.

> Subject: [PATCH 1/2] [llvm-link] Add -override flag to prefer one of a
>  duplicate symbol.

Since you've attached patches from git, I've had a look at the commit
messages, and the functional one in particular doesn't really explain
the motivation.

When you actually commit this you should describe the workflow that
you're planning to use this in, or otherwise explain why this flag is
useful.

(Do you have commit access?  If not, let me know when you send the next
version of the patch and I'll commit it for you.  In that case I can
just reuse my own writeup from earlier in the thread if you want.)

> ---
>  include/llvm/Linker/Linker.h          |  4 +++-
>  lib/Linker/LinkModules.cpp            | 19 +++++++++++++++----
>  test/Linker/Inputs/manual-override.ll |  4 ++++
>  test/Linker/manual-override.ll        | 19 +++++++++++++++++++
>  tools/llvm-link/llvm-link.cpp         | 26 +++++++++++++++++++++++++-

The testcases look a little light.  Can you confirm what should happen
if the linkage types differ?  For example, if the -override is `linkonce`
and the normal function is `weak`, IIUC the code will choose the
`linkonce` function (thusly ignoring linkage rules).  This makes sense to
me -- that's what -override sounds like it should do -- but you should
test it.

Also, a test confirming that functions with `internal` linkage are
unaffected makes sense to me.  I.e., if you have `internal foo` in the
main file, and normal (external) linkage in the -override file, the
`internal foo` should be renamed out of the way instead of replaced.
Similarly, an `internal foo` from the -override file should get renamed
on collisions.  (It looks like this is what will happen already, and it
seems right to me, but I think it deserves a test.)

Finally, one whitespace bug below:

> diff --git a/tools/llvm-link/llvm-link.cpp b/tools/llvm-link/llvm-link.cpp
> index 485a7b2..54d49f5 100644
> --- a/tools/llvm-link/llvm-link.cpp
> +++ b/tools/llvm-link/llvm-link.cpp
> @@ -114,43 +114,29 @@ int main(int argc, char **argv) {
>    auto Composite = make_unique<Module>("llvm-link", Context);
>    Linker L(Composite.get(), diagnosticHandler);
>  
> -  for (unsigned i = 0; i < InputFilenames.size(); ++i) {
> -    std::unique_ptr<Module> M = loadFile(argv[0], InputFilenames[i], Context);
> -    if (!M.get()) {
> -      errs() << argv[0] << ": error loading file '" <<InputFilenames[i]<< "'\n";
> -      return 1;
> +  auto linkFiles = [&](const cl::list<std::string>& Files, bool Override) {

`const cl::list<std::string>& Files` should be
`const cl::list<std::string> &Files`.  I recommend adding clang-format-diff
to your workflow:

http://clang.llvm.org/docs/ClangFormat.html#script-for-patch-reformatting

> +    for (auto File : Files) {
> +      std::unique_ptr<Module> M = loadFile(argv[0], File, Context);
> +      if (!M.get()) {
> +        errs() << argv[0] << ": error loading file '" << File << "'\n";
> +        exit(1);
> +      }
> 


> > +  linkFiles(OverridingInputs, true);
> >
> >    if (DumpAsm) errs() << "Here's the assembly:\n" << *Composite;
> >
> > --
> > 2.3.0 (Apple Git-54)
> >
> >
> 
> 
> 
> 
> <0001-llvm-link-Add-override-flag-to-prefer-one-of-a-dupli.patch><0002-llvm-link-Factor-out-code-for-handling-regular-and-o.patch>





More information about the llvm-commits mailing list