[rfc][gold plugin] Fix pr19901

Mon Aug 11 17:09:28 PDT 2014

On 11 August 2014 15:24, Rafael Espíndola <rafael.espindola at gmail.com>
wrote:

> ping. Rebased patch attached.
>

Okay. So at a high level, you've decided not to use libLTO any more. Funny
aside, the first version of the gold plugin I put in tools/silver and it
was written to use the llvm C++ API directly. I think it was Chris who told
me that this was silly and I should be reusing the logic from libLTO to do
it, so I started a second version in tools/gold.

I'm very sympathetic to the idea of not fixing libLTO and letting it rot
(oops, I meant to say letting Apple maintain it for ld64), though not doing
that would make ultimately make libLTO stronger. I'm calling this out as a
deliberate choice we have to make, we can either push libLTO to being well
enough designed to handle linkers with very different plugin ideologies
(and thus useful for others), or we can consign it to being an ld64
feature. If gold can't use libLTO because its interface is too limiting,
it's likely lld won't either, that any third party linkers won't, etc. No I
don't have any suggestions for how to generalize libLTO's API.

+//===- Utils.h - LLVM LTO Utilities
---------------------------------------===//

Missing emacs major mode marker.

+void runLTOPasses(Module &M, TargetMachine &TM, bool DisableOpt,
+                  bool DisableInline, bool DisableGVNLoadPRE);

This should go away and become an interface to PassManagerBuilder instead.
(PMB doesn't actually run the passes, but that's simple for the caller.)

+  ErrorOr<object::IRObjectFile *> ObjOrErr =
+      object::IRObjectFile::createIRObjectFile(std::move(buffer), Context);
+  std::error_code EC = ObjOrErr.getError();
+  if (EC == BitcodeError::InvalidBitcodeSignature)

What about BitcodeError::InvalidBitcodeHeaderWrapper? Come to think of it,
do we even have any tools in mainline that wrap bitcode?

+  for (auto &Sym : Obj->symbols()) {
+    uint32_t Symflags = Sym.getFlags();
+    if (!(Symflags & object::BasicSymbolRef::SF_Global))
+      continue;

Why are we using the libobject interface for something that must be IR?
This line:

+    const GlobalValue *GV = Obj->getSymbolGV(Sym.getRawDataRefImpl());

is horrible. Is this really what it looks like? Type unsafe laundering
through void pointers or the equivalent? "Oh sure buddy, there's a
GlobalValue* there, I promise."

+        assert(!GV->hasExternalWeakLinkage() &&
+               !GV->hasAvailableExternallyLinkage());

Assert message? Also, hold on, couldn't this happen if we have extern_weak
or available_externally in the .bc file? Or is this safe due to behind the
scenes IRObjectFile magic?

+    if (GV && (GV->hasWeakLinkage() || GV->hasLinkOnceLinkage()))
+      sym.comdat_key = (char*)1;

Let me get this straight. The old code called strdup(M->getSymbolName) to
fill sym.name, then upon finding comdat did sym.comday_key = sym.name. The
new code adds a "std::string NameBuffer" per claimed file, then we
raw_ostream the names into that NameBuffer as we visit each symbol and you
mark the ones that need a comdat_key here, then after this you have a loop
that strlen's over NameBuffer to fill in sym.name (and sym.comdat_key if
was set to 1 here).

Why? Firstly this seems like a separable change. Secondly, did this use to
leak, or did you introduce a double-free when gold tries to free it and
then we delete NameBuffer? Third, would a BumpPtrAllocator<char> be better
here? You get slabs that don't individually move, so you can set the
pointer once instead of doing it in a loop after the fact.

+static void keepGlobalValue(GlobalValue &GV) {

I've been pontificating over what it means to have llvm.used name a
function with a discardable linkage. The idea is that it would be one way
to communicate to internalize that it should form weak linkage out of it.
I'm not sure whether that's safe, it means that attribute used would run
the risk of changing linkages.

+  std::vector<GlobalValue*> Drop;

Space before *.

+        // Since we use the regular lib/Linker, we cannot just internalized
+        // it now or it will not be copied to the merged module. Instead
+        // we force it to be copied and then internalize it.

Typo or grammar error starting at "internalized".

+      // We can only check for address uses once we merge the modules. The

s/once/after/ . As written it causes a "garden path" parse.

+    // This is horrible. Given how lazy loading is implemneted, dropping

Typo, "implemneted" -> "implemented". Also, that really is a serious design
flaw. Oof.

+      message(LDPL_FATAL, "Failed to link moudle: %s", ErrMsg.c_str());

Typo, "moudle" -> "module".

+    if (canBeOmittedFromSymbolTable(GV))
+       internalize(*GV);

Three space indent? Please clang-format.

Nick

On 6 August 2014 11:32, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> >> I don't think we drop them yet, but I see it is as just a convenience
> >> so that it is easy to test it with llc. While developing this patch I
> >> did a few clang bootstraps and only once noticed a unused linkonce_odr
> >> in a .bc produced by clang. The previous passes should drop it. We
> >> could add an option to do it, I am just not sure it would have a big
> >> impact (I will write a patch to check).
> >
> > And I can't even reproduce that anymore.
> >
> > I applied the attached patch to llvm to make it assert when writing
> > bitcode or assembly with a dead linkonce_odr. It will correctly crash
> > in llvm-as or llc when given just
> >
> > define linkonce_odr void @g() {
> >   ret void
> > }
> >
> > But it can successfully build a stage2 with or without LTO, so it
> > looks we are pretty good at dropping unused linkonce_odr and dropping
> > it in codegen proper would not add any real value.
> >
> > Cheers,
> > Rafael
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140811/6a980211/attachment.html>