[LLVMdev] Handling ELF groups.

Nick Kledzik kledzik at apple.com
Wed Dec 19 18:43:03 PST 2012


On Dec 19, 2012, at 6:26 PM, Shankar Kalpathi Easwaran wrote:

> I support Nick's option too. I think handling groups is another example of using follow-on references.
> 
> One question: how does an atom outside the group refer to the main atom here? Won't garbage collection clean up the main atom/signature atom because there are no references?
Well, if there are no references, it should be dead stripped, right?  

A typical use of a COMDAT group is a function with an inline definition that contains a static local variable.  You have two atoms: the function atom and an atom for the data variable.  They are bound together in a group, meaning either both are used or neither is.  The "signature" of the group is the (mangled) name of the function.  If nothing is using that function and the resolver is told to dead strip, it will remove both the function and the variable.  If another object file also defines the function (and variable), one copy of the function will be coalesced away (mergeAsWeak), which (because of the group reference) also coalesces away its copy of the variable atom.
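
As a rough illustration of the behaviour described above, here is a minimal Python sketch of a resolver that coalesces duplicate signature atoms through group references and then dead strips.  It is not lld or darwin linker code: the Atom class, its field names, the helper functions, and the variable atom name "func.static_var" are all hypothetical, and "keep the first definition seen" is just one possible selection policy.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality/hash, so atoms can live in a set
class Atom:
    name: str
    file: str
    merge: str = "no"  # "asWeak" marks a definition that may be coalesced
    group_members: list = field(default_factory=list)  # group-subordinate references
    refs: list = field(default_factory=list)  # ordinary references, by target name
    coalesced: bool = False


def coalesce(atom):
    """Coalesce an atom away, together with every target of its group references."""
    if atom.coalesced:
        return
    atom.coalesced = True
    for member in atom.group_members:
        coalesce(member)


def resolve(atoms):
    """Keep the first definition of each weak name; coalesce later duplicates."""
    winners = {}
    for atom in atoms:
        if atom.merge != "asWeak":
            continue
        if atom.name in winners:
            coalesce(atom)  # also removes the loser's group members
        else:
            winners[atom.name] = atom
    return [a for a in atoms if not a.coalesced]


def dead_strip(atoms, roots):
    """Keep only atoms reachable from the root names via any kind of reference."""
    by_name = {a.name: a for a in atoms}
    live = set()
    work = [by_name[r] for r in roots if r in by_name]
    while work:
        atom = work.pop()
        if atom in live:
            continue
        live.add(atom)
        work.extend(atom.group_members)
        work.extend(by_name[n] for n in atom.refs if n in by_name)
    return [a for a in atoms if a in live]


def make_group(file):
    """Build the inline function atom and its static local variable atom."""
    var = Atom("func.static_var", file)  # hypothetical name for the data atom
    func = Atom("_Z4funcIiET_S0_", file, merge="asWeak", group_members=[var])
    return func, var


# Two object files define the same COMDAT group; main uses the function.
a_func, a_var = make_group("a.o")
b_func, b_var = make_group("b.o")
main = Atom("main", "a.o", refs=["_Z4funcIiET_S0_"])

survivors = resolve([main, a_func, a_var, b_func, b_var])
used = dead_strip(survivors, roots=["main"])

# Same link, but nothing references the function.
lonely_main = Atom("main", "a.o")
unused = dead_strip(resolve([lonely_main, a_func, a_var]), roots=["main"])
```

In this sketch the b.o copies of the function and variable disappear together during resolve, and when main does not reference the function the a.o function and variable are dead stripped together, because the variable is reachable only through the signature atom's group reference.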

-Nick

> 
> On Wed, Dec 19, 2012 at 5:00 PM, Nick Kledzik <kledzik at apple.com> wrote:
> On Dec 19, 2012, at 4:53 PM, Michael Spencer wrote:
> > On Wed, Dec 19, 2012 at 4:43 PM, Nick Kledzik <kledzik at apple.com> wrote:
> >>
> >> On Dec 19, 2012, at 4:25 PM, Michael Spencer wrote:
> >>> So I was looking into handling ELF groups today in the Atom model. It
> >>> appears that we will need to add the concept of a group to the atom
> >>> model directly, as modeling it with references fails to capture some
> >>> semantics.
> >>>
> >>> http://www.sco.com/developers/gabi/latest/ch4.sheader.html
> >>>
> >>> Groups in ELF are collections of sections that must be either included
> >>> or excluded as a unit.
> >> I thought groups were a collection of symbols, not sections.  Is this a case
> >> where there is one symbol per section?
> >
> > It's sections. There is no restriction on symbols in a group section.
> >
> >>
> >>> They also are used to implement COMDAT. Each
> >>> group has an "identifying symbol entry" or "group signature". This is
> >>> only used in the case of COMDAT groups (which are marked with a flag).
> >>> When two COMDAT groups have the same group signature the linker must
> >>> select one (not specified how to pick) and discard _all_ members of
> >>> the other group.
> >>>
> >>> Correctly implementing this requires knowing the group name for each
> >>> group and having the resolver remove the correct set of atoms on
> >>> collision. We also need to be able to explicitly track the identifying
> >>> symbol entry for the relocatable case.
> >>
> >> In the darwin linker this is solved using references.   The "signature" atom in
> >> a group has a "group-subordinate" reference to each atom in the group.
> >> When an atom is coalesced away, its references are scanned and the
> >> target of any group-subordinate reference is also coalesced.
> >>
> >> Conceptually, a group is just a circle around some set of atoms.  That same
> >> information can be represented as a connected graph.  That is, by introducing
> >> a zero size "master" atom with a reference to each atom in the group.  In the special
> >> case of group comdat, the signature atom can be used as the master.
> >>
> >> In other words, I'm not convinced of the need to introduce a new top level class
> >> (Group) to go along with Atom and Reference.  I believe we can encode
> >> the same information using references.
> >>
> >> -Nick
> >
> > Ok, I kinda see how this can work. The only thing I'm still confused
> > about is conforming to this part of the ELF spec:
> >
> > "This is a COMDAT group. It may duplicate another COMDAT group in
> > another object file, where duplication is defined as having the same
> > group signature. In such cases, only one of the duplicate groups may
> > be retained by the linker, and the members of the remaining groups
> > must be discarded."
> >
> > How do we know that a group master is a COMDAT group master as opposed
> > to a normal group master?
> A COMDAT group has a real, named atom as its master.    The other
> groups will have a zero size master atom with some special content type
> (e.g.  typeGroupMaster).
> 
> For COMDAT groups, the "group signature" is the name of the signature
> (master) atom.   If two .o files each have a COMDAT group with the
> same signature, that means they each have a master atom with the same
> name.
> 
> -Nick
> 
> 
> >>>
> >>> An idea for implementing this would be to add a list of Groups to each
> >>> File. I don't believe a Group should be an atom as it has different
> >>> semantics and would have to be treated specially everywhere.
> >>>
> >>> A group would have a name, merge attribute, and a list of atoms it contains.
> >>>
> >>> YAML mockup:
> >>>
> >>> ---
> >>> groups:
> >>> - name: _Z4funcIiET_S0_
> >>>   merge: pickAny
> >>>   members: [_Z4funcIiET_S0_, ".debug._Z4funcIiET_S0_"]
> >>>
> >>> atoms:
> >>> - name: _Z4funcIiET_S0_
> >>>   scope: global
> >>>   merge: asWeak
> >>>   type: code
> >>> ...
> >>>
> >>> The main problem I see with this is that groups are no longer
> >>> represented explicitly in the reference graph.
> >>>
> >>> - Michael Spencer
> >>
> 
