[PATCH] Linker: Replace overridden subprograms

Duncan P. N. Exon Smith dexonsmith at apple.com
Tue Dec 16 14:18:51 PST 2014


> On 2014-Dec-16, at 12:49, David Blaikie <dblaikie at gmail.com> wrote:
> 
> Should "OverridingFunctions" be "OverriddenFunctions"?

Nope, these are the overriding functions.  The functions they've
overridden have been deleted and no longer exist.

> (it looks like everything in that set gets dropped from the subprogram lists - so if those are the functions chosen to be kept by the linker, I would think that would end up with them all being dropped from the debug info - the exact opposite of what's desired - but I assume I'm misunderstanding something)

The timing is important.

When this function is called, `OldF->replaceAllUsesWith(NewF)` has been
called, so all the old subprograms will be pointing at the NewF.

However, the named metadata hasn't been linked yet, so the new compile
units aren't yet in `DstM`.

> 
> (stylistically I'd remove a bunch of the conditionals in that code as not too important: if the function set is empty, everything else should be fairly trivial, I don't necessarily know that the early return is particularly valuable.

This function gets called each time a module gets linked in.
Unnecessarily traversing all the compile units and their subprograms is
wasteful.

Moreover, this code is quadratic in the worst case, since it gets called
once for every module that's linked in (and it traverses all the compile
units added so far).  In practice, most `SrcM` won't have functions that
have overridden something in `DstM`.  Returning early here is important.

> Two calls to getNamedMetadata (& two checks of the same non-null result)

Actually, these are two different results.  One is for the `SrcM`, and
the other is for the `DstM`.  If there is no compile unit in the `SrcM`,
then I'm leaving the subprograms as they are the `DstM`, since there's no
collision.

> . CUs in the CU list should never be null, the subprogram array should never be null (I think that's the case?).

This might be true... I can change these to assertions.  However
depending on how I resolve Rafael's concern about symmetry, there could
be nulls in the array of subprograms.  Do you think that's a problem?

(Still, I'd rather add the code with conditions and switch it to
assertions in a separate commit, so that if the assertions fire we know
it's unrelated to the other semantic changes.)

> & I might even not bother with the old size != new size condition either - just to keep the code simple/easy to read)

Calling `replaceSubprograms()` creates a lot of function traffic, and
`MDNode::get()` isn't guaranteed to return the same `MDNode` (since the
one there isn't guaranteed to be in uniquing mode).  I'd rather save
memory and time, since LTO is already bad enough.

> 
> - David
> 
> On Tue, Dec 16, 2014 at 12:14 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> > On 2014 Dec 16, at 10:27, David Blaikie <dblaikie at gmail.com> wrote:
> >
> >
> >
> > On Tue, Dec 16, 2014 at 10:06 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> >
> > > On 2014-Dec-16, at 10:00, David Blaikie <dblaikie at gmail.com> wrote:
> > >
> > >
> > >
> > > On Tue, Dec 16, 2014 at 9:48 AM, Rafael EspĂ­ndola <rafael.espindola at gmail.com> wrote:
> > > > It's possible to put DWARF subprogram definitions for linkonce_odr functions
> > > > into comdat groups so they are dropped along with the linkonce_odr function,
> > > > but Clang (& I don't think GCC by default) does that (there's some overhead
> > > > to that representational choice, etc). But yes, it can be done. I think
> > > > today, for both GCC and Clang, you'd end up with two subprogram definitions
> > > > (one in each DWARF compile unit) each pointing to the same high_pc/low_pc to
> > > > describe the function.
> > >
> > > I thought this was part of a special case for dwarf in linkers (and
> > > the missing feature why lld produced binaries are much larger in -g
> > > builds).
> > >
> > > To the best of my knowledge, at least the usual Linux linkers (gold and binutils-ld) don't have any special cases for DWARF (well, gold has some magic to generate a GDB debug info index, maybe), it's just sections and relocations like any other data in an object file. /maybe/ dsymutil on MacOS does, but I've not heard about it. (& possibly dwz - a debug info-aware compression tool could do that trick, it's designed to eliminate redundancy in debug info)
> > >
> > > A basic test case shows roughly what I described. (attached dwarfdump, if you're curious - you can see what the source would've looked like from the DWARF there (inline function in a header, two translation units that instantiate the inline function, etc))
> > >
> > >
> > > > (but yeah, vaguely sounds like what you're suggesting would make sense - I
> > > > haven't looked at the initial proposed patch yet, though)
> > >
> > > Cheers,
> > > Rafael
> > > <dump.txt>
> >
> > Even if other linkers don't do it, is it reasonable to delete subprograms from
> > compile units if the canonical one is from another compile unit (and has a
> > subprogram there)?
> >
> > I think so - more importantly, I assume that's what we'll end up doing anyway. (check the actual DWARF output - since we build a map of subprograms, I don't think we end up emitting the subprogram to both CUs anyway - we just ignore all but the first one we find - that's my bet anyway)
> >
> >
> > If so, I can just do that.
> 
> How does this look?
> 





More information about the llvm-commits mailing list