[LLVMdev] Need to create symbols only once

Nick Kledzik kledzik at apple.com
Mon Dec 10 18:58:28 PST 2012


On Dec 10, 2012, at 9:19 AM, Shankar Easwaran wrote:
> Thanks for the reply Nick.
> 
> I will use the Writer::addFiles functionality. Do you want to move the SimpleFile class to lld/Core ?
If others have a use for it, we can move it out of lib/ReaderWriter/MachO.  I'm not sure it makes sense in include/lld/Core, because those are interfaces that external clients would use.  The SimpleFile stuff is really just something that other ReaderWriters might use.  So perhaps it could go in lib/ReaderWriter.  (Thoughts, Michael?)


> 
> It might be useful for other types of object files too(like for ELF here).
> 
> How does typeFirstInSection/typeLastinSection know that the addresses that need to be used for those symbols are the symbol values for the section start / section end ?

The typeFirstInSection comes from the darwin linker.  The model is that the Order Pass (which does not exist yet in lld) sorts atoms, and it will sort typeFirstInSection atoms to the start of their section.  The problem is that in lld, "sections" don't exist in core linking.  Real named sections are only assigned when you get to the Writer.  So a Pass could not do this sorting.  The general idea still holds; we just need to adjust it for lld.  Here is my proposal:

We add a new attribute to DefinedAtoms:

   enum SectionPosition {
     sectionPositionLowest,
     sectionPositionLow,
     sectionPositionAny,
     sectionPositionHigh,
     sectionPositionHighest
   };

   virtual SectionPosition sectionPosition();

For most atoms, the value returned by sectionPosition() will be sectionPositionAny.  Atoms that must be pinned to the start of a section return sectionPositionLowest.  We should also assert that any sectionPositionLowest atom is zero length; that allows multiple names for the start of the section.  The sectionPositionLow and sectionPositionHigh values are a way to mark atoms that prefer to be towards the start or end of their section.
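
As a rough sketch of how an atom might report its position, assuming a simplified stand-in class (SketchAtom, SectionStartAtom, isValidPlacement and verifyAtom are hypothetical names for illustration, not existing lld classes or functions):

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

enum SectionPosition {
  sectionPositionLowest,
  sectionPositionLow,
  sectionPositionAny,
  sectionPositionHigh,
  sectionPositionHighest
};

// Hypothetical, simplified stand-in for DefinedAtom; it carries only the
// pieces needed to illustrate the proposal.
class SketchAtom {
public:
  SketchAtom(std::string name, std::uint64_t size)
      : _name(std::move(name)), _size(size) {}
  virtual ~SketchAtom() = default;

  const std::string &name() const { return _name; }
  std::uint64_t size() const { return _size; }

  // Most atoms have no placement preference.
  virtual SectionPosition sectionPosition() const {
    return sectionPositionAny;
  }

private:
  std::string _name;
  std::uint64_t _size;
};

// An atom that names the start of a section: zero length, pinned lowest.
class SectionStartAtom : public SketchAtom {
public:
  explicit SectionStartAtom(std::string name)
      : SketchAtom(std::move(name), 0) {}

  SectionPosition sectionPosition() const override {
    return sectionPositionLowest;
  }
};

// The proposed invariant: a sectionPositionLowest atom must be zero length.
inline bool isValidPlacement(const SketchAtom &atom) {
  if (atom.sectionPosition() == sectionPositionLowest)
    return atom.size() == 0;
  return true;
}

// Assertion form of the same check, e.g. for use while atoms are added.
inline void verifyAtom(const SketchAtom &atom) {
  assert(isValidPlacement(atom) && "pinned-lowest atom must be zero length");
}
```

Because the pinned atoms occupy no bytes, several of them (say, two different names for the same section start) can share the same address without disturbing the layout.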

With this in place, we can add an OrderPass which sorts atoms by contentType.  Within a group of atoms of the same type, they are sorted by sectionPosition and, within a sectionPosition, by .o file order and then by order within the .o file.
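
A minimal sketch of that sort order, assuming each atom can be reduced to a small record (AtomInfo, its fields, orderBefore and orderAtoms are hypothetical names for illustration; no such pass exists in lld yet):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

enum SectionPosition {
  sectionPositionLowest,
  sectionPositionLow,
  sectionPositionAny,
  sectionPositionHigh,
  sectionPositionHighest
};

// Hypothetical flattened view of an atom, holding only the sort keys.
struct AtomInfo {
  std::string name;
  int contentType;            // stand-in for DefinedAtom::ContentType
  SectionPosition position;   // value of sectionPosition()
  std::uint32_t fileOrdinal;  // position of the .o file among the inputs
  std::uint32_t atomOrdinal;  // position of the atom within its .o file
};

// Lexicographic comparison: content type first, then section position,
// then .o file order, then order within the .o file.
inline bool orderBefore(const AtomInfo &a, const AtomInfo &b) {
  return std::tie(a.contentType, a.position, a.fileOrdinal, a.atomOrdinal) <
         std::tie(b.contentType, b.position, b.fileOrdinal, b.atomOrdinal);
}

// Sort the atoms in place; stable_sort keeps fully equal keys in input order.
inline void orderAtoms(std::vector<AtomInfo> &atoms) {
  std::stable_sort(atoms.begin(), atoms.end(), orderBefore);
}
```

Since the enumerators are declared from lowest to highest, comparing them directly puts sectionPositionLowest atoms at the front of each content-type group and sectionPositionHighest atoms at the back.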

-Nick

> 
> I didn't see references to typeFirstInSection/typeLastInSection in the MachO part of lld either; any pointers to how you are doing that would be helpful.
> 
> If not, I need to duplicate that piece of code, which does not make sense.
> 
> Thanks
> 
> Shankar Easwaran
> 
> On 12/7/2012 4:59 PM, Nick Kledzik wrote:
>> On Dec 7, 2012, at 11:51 AM, Shankar Easwaran wrote:
>>> We have a few symbols like __bss_start, __bss_end, which are Undefined symbols in the code.
>>> 
>>> I want a way in the Reader to create specific atoms before the linker bootstraps.
>>> 
>>> I didn't find a way to do that with the existing interfaces.
>>> 
>>> The way it needs to work is as below :-
>>> 
>>> 1) ReaderELF creates Absolute symbols (for __bss_start, __bss_end etc)
>>> 2) ReaderELF reads each file and adds Atoms to the list
>>> 3) If the atoms the linker defined were Global, the atoms that the Reader created should get overridden with the linker created ones.
>>> 
>>> This may also be needed to pull in specific symbols from archive libraries, too.
>>> 
>>> I was thinking to add an interface to ReaderELF which would be called by the driver, but the problem is the DefinedAtom/AbsoluteAtom have which file owns the atom.
>>> 
>>> I was discussing with Michael on this, and he was proposing to add a Pre-Read file.
>>> 
>>> Do you have any other opinions too ?
>> We have a similar requirement in darwin's ld64 linker, but even more general.  Any binary can do the following to introspect itself:
>> 
>> struct stuff { int a; int b; };
>> 
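>> // The magic symbols name the section's first and one-past-last bytes,
>> // so declare them as objects and take their addresses.
>> extern struct stuff  stuff_start  __asm("section$start$__DATA$__my");
>> extern struct stuff  stuff_end    __asm("section$end$__DATA$__my");
>> 
>> void examineSection() {
>> 	const struct stuff* p;
>> 	for (p = &stuff_start; p < &stuff_end; ++p) {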
>> 		// do stuff with p
>> 	}
>> }
>> 
>> That is, there are magic symbol names which reference the beginning or ending of any particular section.   To support this, the linker lazily creates atoms when references to these magic symbols are discovered during resolving.
>> 
>> I have some hooks for this already in place in lld:
>> 
>> 1) There is Writer::addFiles().  This method gives any writer a chance to add files/atoms to the set of atoms the Resolver works on.   The Writer::addFiles() method is called after all input files are added.  If you want to add something lazily (like the darwin linker does for section$start$ symbols), the writer returns a File object akin to a static library.  That is, it provides no initial atoms, but can provide atoms as a last resort (so a .o file would override it).  The WriterMachO already uses the addFiles() method to add CRuntime symbols.
>> 
>> 2) DefinedAtom::ContentType already has typeFirstInSection and typeLastInSection.  These are intended to be used for the content type of the atoms which represent the magic symbols for the start and end of a section.  The key here is that the Pass (not written yet) which sorts atoms, knows to sort these atoms to the start or end of their respective sections.
>> 
>> If you don't want this fully general, lazy approach, you could have your WriterELF::addFiles() return a regular object file that has atoms named __bss_start and __bss_end, but marked mergeAsWeak so that any user-defined atoms will override them.
>> 
>> -Nick
>> 
>> 
>> 
>> 
> 
> 
> -- 
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation
> 



