[Patch][pr19848] Produce explicit comdats in clang

Richard Smith richard at metafoo.co.uk
Mon Jan 5 08:22:14 PST 2015


On Mon, Dec 22, 2014 at 4:44 AM, Rafael EspĂ­ndola <
rafael.espindola at gmail.com> wrote:

> Sorry for being late on my own thread. Trying to reply to most comments at
> once:
>
> * I am pretty sure that we want to split the "in a comdat" notion from
> the linkage. I am not married to how this patch does it.
> * On MachO. This should actually improve things. A plain linkonce_odr
> will then mean the same thing on ELF and MachO and a comdat in MachO
> gets rejected early.
> * One Idea I had was having a selfcomdat syntactic sugar in the .ll,
> but I think I like rnk's suggestion better: just omit the $value (and
> flipping the order in the case of globals).
> * For the idea of using external instead of linkonce linkage. I agree
> with David. We should be making these things independent. Keeping the
> linkonce_odr/weak_odr is useful for the 'odr' bit, for saying the
> symbol should be STB_WEAK and for the can/cannot be dropped
> distinction.
>

Yes, I agree these things should be independent. (Note that I think that
STB_WEAK and can/cannot be dropped are orthogonal -- an inline function
definition is discardable, but does not fundamentally need to be STB_WEAK
if we can put it in a comdat.)

* I like the idea of optimizing the case of a select any that has the
> same name as an existing symbol, but it is not that simple in cases
> like
>
> $v = comdat any
> @v = global i32 42 comdat $v
> @bar = global i32 42 comdat $v
>
> When we are building bar, how do we find if there is a "simple comdat
> $v"? We would have to first lookup a GV named $v. When building @bar
> first and @v second we would still need to look at the "normal comdat
> table" when building $v to see if a comdat was already there.
>
> Since the main saving is on the symbol names, another idea would be to
> replace
>
> typedef StringMap<Comdat> ComdatSymTabType;
>
> With
>
> typedef DenseMap<const char *, Comdat> ComdatSymTabType;
>
> And have an extra StringMap with the non-symbol names like C5/D5 (or
> even introduce name -> null mappings in the gv symbol table?).
>
> But it would have the main constraint as the full optimization: when
> dropping the comdat, We have to be sure to drop $v last or we might
> end up we with a pointer to garbage.
>
> So, how about splitting work in:
>
> * llvm: Implement the syntatic change for comdats with the same name
> as the GV, but keep the current representation.
> * clang: this patch, but with less test change noise.
> * llvm: stop producing implicit comdats.
> * llvm: implement and benchmark one of the possible optimizations.


This sounds like a good path forward to me.


> On 20 December 2014 at 19:13, David Majnemer <david.majnemer at gmail.com>
> wrote:
> >
> >
> > On Fri, Dec 19, 2014 at 5:43 PM, Richard Smith <richard at metafoo.co.uk>
> > wrote:
> >>
> >> On Fri, Dec 19, 2014 at 12:51 PM, Rafael EspĂ­ndola
> >> <rafael.espindola at gmail.com> wrote:
> >>>
> >>> There is quiet a bit of history behind this.
> >>>
> >>> The llvm IR until very recently had no support for comdats. This was a
> >>> problem when targeting C++ on ELF/COFF as just using weak linkage
> >>> would cause quiet a bit of dead bits to remain on the executable
> >>> (unless -ffunction-sections, -fdata-sections and --gc-sections were
> >>> used).
> >>>
> >>> To fix the problem, llvm's codegen will just assume that any weak or
> >>> linkonce that is not in an explicit comdat should be output in one
> >>> with the same name as the global.
> >>>
> >>> This unfortunately breaks cases like pr19848 where a weak symbol is
> >>> not expected to be part of any comdat.
> >>>
> >>> Now that we have explicit comdats in the IR, we can finally get both
> >>> cases right.
> >>>
> >>> This first patch just makes clang give explicit comdats to
> >>> GlobalValues where it is allowed to.
> >>>
> >>> A followup patch to llvm will then stop implicitly producing comdats.
> >>
> >>
> >> I'm not sure this is the right direction, but if it is, this change
> looks
> >> reasonable to me.
> >
> >
> > I think this patch is fine, what are your reservations?
> >
> >>
> >>
> >> Assuming we want to go in this direction, I think we should also
> consider
> >> switching from a weak linkage to a strong linkage for entities governed
> by
> >> the ODR; using weak linkage is a hack, since these symbols are really
> not
> >> weak in the usual sense.
> >
> >
> > Maybe but I think that can be considered separately from this change.
> >
> >>
> >> That exposes a hole in our comdat support: we should have the ability to
> >> define a discardable comdat (which need not be emitted if none of its
> >> symbols are used in the current TU); a discardable comdat with an
> external
> >> function definition seems like the right representation for an inline
> >> function.
> >
> >
> > I don't think that we want to abuse linkage like that.  We really need to
> > sit down untangle this mess.  Before comdats, there was never a way to
> say
> > "give me a global, stick it in a COMDAT but *don't* make it weak",
> > COMDAT-ness and weak-ness came hand in hand.
> >
> > Now we have COMDATs and the world of linkage is unchanged, I don't think
> > that this is a good place to be.
> >
> > We really need to kill linkage and replace it with several things, each
> of
> > which individually control the behavior of the global.  Things like
> > @llvm.used are a hack to work around the lack of flexibility linkage has
> to
> > offer.
> >
> > I'm more than happy to help out with this if we can get consensus on a
> > design.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20150105/29f82cec/attachment.html>


More information about the cfe-commits mailing list