[llvm-commits] [patch][gold plugin] Don't internalize symbols in objects that we will use as pass-through

Nick Lewycky nicholas at mxc.ca
Tue Jun 22 22:29:41 PDT 2010


Rafael Espindola wrote:
> Cary, the patch is for LLVM, but I am ccing you in case you think this
> should be done in gold somehow.
>
> This is hopefully the last patch in the libgcc and its dependencies saga.
>
> The last remaining issue that I know of is that gold can ask us to load a
> file because it is used only in the IR. We then internalize the
> symbols in that file and run codegen. Codegen can then create new
> undefined references to symbols in those files, forcing gold to fetch
> them again (from the native code this time).

To restate, the scenario is:
  1. you've got memcpy defined in IR and its uses only in IR
  2. so you internalize memcpy and inline it out of existence
  3. then LLVM codegen emits a call to memcpy since it detected a copy
     loop and assumes libc/libgcc will always be there
  4. you want gold to link against native libc/libgcc that weren't in the
     original link line
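
As a small illustration of step 3, here is the kind of source-level copy loop involved. This is only a sketch: copy_bytes is a made-up name, and whether a given compiler and optimization level actually turns the loop into a memcpy call is an assumption, not something this code guarantees.

```cpp
#include <cstddef>

// Hypothetical example: the source never calls memcpy, but an optimizing
// compiler may recognize this byte-copy loop and emit a call to memcpy in
// the generated native code, creating a new undefined reference.
void copy_bytes(char *dst, const char *src, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i];
}
```

If that happens after memcpy's IR definition has been internalized and dropped, the only remaining definition is the native one.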

So things went awry at step 3 really, but I'm going to assume that 
you've already considered and rejected removing all libcalls from the 
backend.

Firstly, do you really need to do this? Why do you need to link against 
libgcc/libc as bitcode if you're willing to link to them as native code 
after all?

> There are a number of problems with this:
> *) Gold doesn't implement this all that well. It still has in its
> symbol table that it loaded a file defining that symbol.
> *) We would end up with multiple copies of some symbols.
> *) LLVM can use smaller chunks of files than gold. Consider the case
> of a file defining functions foo and bar. Function foo is used from
> elf, and so we don't internalize it. Function bar is not used at that
> time and we drop it. Now codegen introduces an undefined reference to
> bar. What should gold do? Bringing in that file will fail because we
> will have two visible definitions of foo. Not doing so will fail
> because there is nowhere else to find bar.
>
> The best solution I could find is to disable internalize for any
> functions defined in a library that is passed through. The attached
> patch does this. It can be optimized, but I am not sure if that is
> worth it, since normally there are only two libraries being passed
> through (libgcc and libc).

+  sys::Path path = sys::Path(file->name);
+  cf.pass_through = false;
+  if (path.getSuffix() == "a") {
+    llvm::StringRef basename = path.getBasename();
+    if (basename.startswith("lib")) {
+      llvm::StringRef libname = basename.substr(strlen("lib"));
+      for (std::vector<std::string>::iterator i = options::pass_through.begin(),
+                                              e = options::pass_through.end();
+           i != e; ++i) {
+        llvm::StringRef item = *i;
+        if (item.startswith("-l") &&
+            item.substr(strlen("-l")) == libname)
+          cf.pass_through = true;
+      }
+    }
+  }

Please factor this into a helper function that takes file->name and 
returns whether or not it was found to be one of the pass_through libraries.
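
Something along these lines is what I have in mind. This is a rough, standalone sketch rather than plugin code: the name isPassThroughLibrary is made up, it uses plain std::string instead of sys::Path and llvm::StringRef, and it takes the option list as a parameter instead of reading options::pass_through directly. Unlike the patch, it also rejects an empty library name (a file called just "lib.a").

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns true if file_name (e.g. "/usr/lib/libc.a")
// is a static archive "lib<name>.a" and "-l<name>" appears in pass_through.
static bool isPassThroughLibrary(const std::string &file_name,
                                 const std::vector<std::string> &pass_through) {
  // Drop any leading directory components.
  std::string::size_type slash = file_name.find_last_of('/');
  std::string base =
      (slash == std::string::npos) ? file_name : file_name.substr(slash + 1);

  const std::string prefix = "lib";
  const std::string suffix = ".a";

  // Need at least one character between "lib" and ".a".
  if (base.size() <= prefix.size() + suffix.size())
    return false;
  if (base.compare(0, prefix.size(), prefix) != 0)
    return false;
  if (base.compare(base.size() - suffix.size(), suffix.size(), suffix) != 0)
    return false;

  // "libgcc.a" -> "gcc"
  std::string libname = base.substr(
      prefix.size(), base.size() - prefix.size() - suffix.size());

  // Look for a matching "-l<name>" among the pass-through options.
  for (const std::string &item : pass_through) {
    if (item.compare(0, 2, "-l") == 0 && item.substr(2) == libname)
      return true;
  }
  return false;
}
```

The call site would then reduce to a single assignment to cf.pass_through.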

Nick



