[LLVMdev] Need to create symbols only once

Nick Kledzik kledzik at apple.com
Fri Dec 7 14:59:18 PST 2012

On Dec 7, 2012, at 11:51 AM, Shankar Easwaran wrote:
> We have a few symbols, like __bss_start and __bss_end, which are undefined symbols in the code.
> I want a way in the Reader to create specific atoms before the linker bootstraps.
> I didn't find a way to do that with the existing interfaces.
> The way it needs to work is as follows:
> 1) ReaderELF creates Absolute symbols (for __bss_start, __bss_end etc)
> 2) ReaderELF reads each file and adds Atoms to the list
> 3) If the atoms the linker defined were Global, the atoms that the Reader created should get overridden with the linker created ones.
> This may also be needed to pull in specific symbols from archive libraries.
> I was thinking of adding an interface to ReaderELF which would be called by the driver, but the problem is that each DefinedAtom/AbsoluteAtom records which file owns the atom.
> I was discussing this with Michael, and he proposed adding a Pre-Read file.
> Do you have any other opinions?

We have a similar requirement in darwin's ld64 linker, but even more general.  Any binary can do the following to introspect itself:

struct stuff { int a; int b; };

// The magic symbols sit at the section boundaries, so it is their
// addresses (not their contents) that delimit the section.
extern struct stuff  stuff_start  __asm("section$start$__DATA$__my");
extern struct stuff  stuff_end    __asm("section$end$__DATA$__my");

void examineSection() {
	const struct stuff* p;
	for (p = &stuff_start; p < &stuff_end; ++p) {
		// do stuff with p
	}
}

That is, there are magic symbol names which reference the beginning or ending of any particular section.   To support this, the linker lazily creates atoms when references to these magic symbols are discovered during resolving.  
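As a rough illustration of the first step in that lazy scheme, here is a minimal sketch of recognizing a magic name when an undefined reference turns up. The type SectionBoundaryRef and the function parseSectionBoundary are hypothetical names made up for this example; they are not ld64 or lld code, and the real linker may well split the name differently.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type for illustration; not part of ld64 or lld.
struct SectionBoundaryRef {
  bool isStart;         // true for section$start$, false for section$end$
  std::string segment;  // e.g. "__DATA"
  std::string section;  // e.g. "__my"
};

// Parses names of the form section$start$<segment>$<section> or
// section$end$<segment>$<section>; returns nullopt for any other name.
std::optional<SectionBoundaryRef> parseSectionBoundary(std::string_view name) {
  constexpr std::string_view startPrefix = "section$start$";
  constexpr std::string_view endPrefix = "section$end$";
  bool isStart = false;
  if (name.substr(0, startPrefix.size()) == startPrefix) {
    isStart = true;
    name.remove_prefix(startPrefix.size());
  } else if (name.substr(0, endPrefix.size()) == endPrefix) {
    name.remove_prefix(endPrefix.size());
  } else {
    return std::nullopt;
  }
  // What remains should be <segment>$<section>, with both parts non-empty.
  const std::size_t sep = name.find('$');
  if (sep == std::string_view::npos || sep == 0 || sep + 1 == name.size())
    return std::nullopt;
  return SectionBoundaryRef{isStart, std::string(name.substr(0, sep)),
                            std::string(name.substr(sep + 1))};
}

// Usage: only the magic names are recognized; ordinary symbols are not.
inline void checkParseSectionBoundary() {
  auto ref = parseSectionBoundary("section$end$__DATA$__my");
  assert(ref && !ref->isStart);
  assert(ref->segment == "__DATA" && ref->section == "__my");
  assert(!parseSectionBoundary("__bss_start"));
}
```

A resolver could call something like this only for names that are still undefined, and create the boundary atom at that point.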

I have some hooks for this already in place in lld:

1) There is Writer::addFiles().  This method gives any writer a chance to add files/atoms to the set of atoms the Resolver works on.  The Writer::addFiles() method is called after all input files are added.  If you want to add something lazily (like the darwin linker does for section$start$ symbols), the writer returns a File object akin to a static library.  That is, it provides no initial atoms, but can provide atoms as a last resort (so a .o file would override it).  The WriterMachO already uses the addFiles() method to add CRuntime symbols.
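To make the "last resort" behaviour concrete, here is a toy model of it. Atom, LazyFile and resolve below are simplified stand-ins invented for this sketch, not lld's actual classes or the real Writer::addFiles() signature; the only point is the ordering: definitions from regular inputs are taken first, and the lazy file is consulted only for names that are referenced and still undefined.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for illustration; these are not lld's real classes.
struct Atom {
  std::string name;
  std::string owner;  // which file supplied the atom
};

// Behaves like a static library: offers nothing up front, answers on demand.
struct LazyFile {
  std::string owner;
  std::set<std::string> canProvide;
  std::optional<Atom> find(const std::string& name) const {
    if (canProvide.count(name) == 0) return std::nullopt;
    return Atom{name, owner};
  }
};

// Definitions from regular inputs go in first; the lazy file is asked only
// about referenced names that nothing else has defined.
std::map<std::string, Atom> resolve(const std::vector<Atom>& fromInputs,
                                    const std::vector<std::string>& referenced,
                                    const LazyFile& lazy) {
  std::map<std::string, Atom> table;
  for (const Atom& a : fromInputs) table.emplace(a.name, a);
  for (const std::string& name : referenced) {
    if (table.count(name) != 0) continue;  // already defined by an input file
    if (auto atom = lazy.find(name)) table.emplace(name, *atom);
  }
  return table;
}

// Usage: user.o defines __bss_start itself, so only __bss_end comes from
// the lazy file, and the unreferenced name is never pulled in.
inline void checkResolve() {
  std::vector<Atom> inputs;
  inputs.push_back(Atom{"__bss_start", "user.o"});
  std::vector<std::string> referenced = {"__bss_start", "__bss_end"};
  LazyFile lazy{"<linker>", {"__bss_start", "__bss_end", "__unused"}};
  auto table = resolve(inputs, referenced, lazy);
  assert(table.at("__bss_start").owner == "user.o");
  assert(table.at("__bss_end").owner == "<linker>");
  assert(table.count("__unused") == 0);
}
```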

2) DefinedAtom::ContentType already has typeFirstInSection and typeLastInSection.  These are intended to be used for the content type of the atoms which represent the magic symbols for the start and end of a section.  The key here is that the Pass which sorts atoms (not written yet) knows to sort these atoms to the start or end of their respective sections.
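Since that Pass is not written yet, the following is only a guess at its core idea, using a toy Atom and a cut-down ContentType enum. typeFirstInSection and typeLastInSection are the names mentioned above; typeOther, boundaryRank and sortAtoms are placeholders invented for this sketch. A stable sort keeps ordinary atoms in their original relative order while moving the boundary atoms to the edges of their section.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy model; typeOther is a placeholder for every ordinary content type.
enum class ContentType { typeOther, typeFirstInSection, typeLastInSection };

struct Atom {
  std::string name;
  std::string section;
  ContentType type;
};

// Boundary atoms rank before (0) or after (2) all ordinary atoms (1).
inline int boundaryRank(ContentType t) {
  switch (t) {
    case ContentType::typeFirstInSection: return 0;
    case ContentType::typeLastInSection:  return 2;
    default:                              return 1;
  }
}

// Groups atoms by section name, then orders by boundary rank. stable_sort
// preserves the input order of atoms that compare equal.
inline void sortAtoms(std::vector<Atom>& atoms) {
  std::stable_sort(atoms.begin(), atoms.end(),
                   [](const Atom& a, const Atom& b) {
                     if (a.section != b.section) return a.section < b.section;
                     return boundaryRank(a.type) < boundaryRank(b.type);
                   });
}

// Usage: the start/end atoms end up bracketing the ordinary atoms.
inline void checkSortAtoms() {
  std::vector<Atom> atoms = {
      {"b", "__my", ContentType::typeOther},
      {"end", "__my", ContentType::typeLastInSection},
      {"a", "__my", ContentType::typeOther},
      {"start", "__my", ContentType::typeFirstInSection},
  };
  sortAtoms(atoms);
  assert(atoms.front().name == "start");
  assert(atoms.back().name == "end");
}
```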

If you don't want this fully general, lazy approach, you could have your WriterELF::addFiles() return a regular object file that has atoms named __bss_start and __bss_end, but marked mergeAsWeak so that any user-defined atoms will override them.
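The override rule in that simpler approach can be sketched as follows. This is a toy symbol table, not lld's Resolver: mergeAsWeak is the name used above, while mergeNo, Atom and SymbolTable are assumptions made for the example, and a real linker has more merge cases than these two.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model; only the two merge kinds needed for the example.
enum class Merge { mergeNo, mergeAsWeak };

struct Atom {
  std::string name;
  Merge merge;
  std::string owner;  // which file supplied the atom
};

class SymbolTable {
 public:
  // Adds an atom. A non-weak definition replaces a weak one with the same
  // name; a weak one never replaces anything. Returns false only when two
  // non-weak definitions collide (a duplicate-symbol error in a real link).
  bool add(const Atom& atom) {
    auto it = atoms_.find(atom.name);
    if (it == atoms_.end()) {
      atoms_.emplace(atom.name, atom);
      return true;
    }
    const bool oldWeak = it->second.merge == Merge::mergeAsWeak;
    const bool newWeak = atom.merge == Merge::mergeAsWeak;
    if (oldWeak && !newWeak) it->second = atom;  // user definition wins
    return oldWeak || newWeak;
  }

  const Atom* find(const std::string& name) const {
    auto it = atoms_.find(name);
    return it == atoms_.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::string, Atom> atoms_;
};

// Usage: the linker-supplied defaults survive only where no input file
// defines the same name.
inline void checkWeakOverride() {
  SymbolTable table;
  table.add(Atom{"__bss_start", Merge::mergeAsWeak, "<linker>"});
  table.add(Atom{"__bss_end", Merge::mergeAsWeak, "<linker>"});
  table.add(Atom{"__bss_start", Merge::mergeNo, "user.o"});
  assert(table.find("__bss_start")->owner == "user.o");
  assert(table.find("__bss_end")->owner == "<linker>");
}
```

The result does not depend on whether the weak defaults are added before or after the user's atoms, which is why the writer can hand them over after all input files have been read.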

