[LLVMdev] defining symbols with lld

Nick Kledzik kledzik at apple.com
Fri Aug 23 13:40:37 PDT 2013


On Aug 22, 2013, at 6:42 PM, Shankar Easwaran <shankare at codeaurora.org> wrote:
> On 8/22/2013 3:44 PM, Nick Kledzik wrote:
>> 
>> Linker scripts have the same need. My idea for this was to allow atoms to have an associated expression tree and would have references to all symbols in that tree. The resolver wouldn't need any special handling for this, and the backend would just need to evaluate the expression at the end.
>> Agreed.
>> 
>> On one hand, this sounds like an AbsoluteAtom because it has no content and no section, but has an address.  On the other hand, it needs References to the symbols used in its expression and only DefinedAtoms can have content.  But DefinedAtoms are normally laid out and assigned addresses.   One way to work this in would be to make them DefinedAtoms of size zero, with a new ContentType that the ELF Writer knows does not need an address assigned.  The References to each symbol used in its expression ensures that all needed symbols exist and are not dead stripped.  The References will need some Kind the ELF Writer knows is not a relocation, but there for the expression evaluator to find the addresses of the symbols used.
>> 
>> -Nick
> 
> 
> These are the changes I plan to make, and some questions that I have
> 
> a) Define a new contentType for DefinedAtoms to say 'Expression'
> b) Create a new class ExprnAtom derived from DefinedAtom
> c) The expression could also contain various functions that could be set in the expression, how should that be represented ?
I don’t understand this.  I thought expressions were like "_foo + 10".  What do you mean by functions set in the expression?

> d) The actual content of the Atom would be a string representation of the expression, that can be used to emit YAML information
Or a parse tree of the expression.

> e) The expression tree needs to be stored into the Native intermediate representation too, right? Store them as atoms? How to represent constants and functions?
Well, technically the only places these expressions come from are the command line and linker scripts, so we don’t *have* to have a way to externalize the atoms in YAML or native format.  But it would be nice to allow that, so that some future C or asm extension would let you create these.

> f) What about lld core ?
> g) Create a new reference type, How does (ExpressionAtom, ExpressionFunction, ExpressionConstant) this sound ?
The expression could be an opaque string, except that we need to validate it and we need the resolver to find the symbol names referenced in the expression.  The data structure lld provides is a sequence of References.  The normal data structure for an expression is an (expression) tree.  We can fit the square peg in the round hole by converting the expression to postfix and making the Reference order the evaluation order.  So the expression "A + B * 2" would become the Reference sequence:
	kind=push-sym	target=B
	kind=push-const	addend=2
	kind=multiply
	kind=push-sym	target=A
	kind=add
With this sequence of References we have references to the symbols we need and a simple way to evaluate.  It is also easy to write in the native file format and in YAML (just dump the references as I showed above).  The original expression string is lost, but it could be recreated if we wanted to write a postfix-to-infix converter.
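
To make the evaluation step concrete, here is a minimal sketch of how a writer might walk such a sequence with a value stack.  None of this is existing lld API: ExprKind, ExprRef, evaluate() and exampleRefs() are hypothetical names made up for illustration, and the symbol addresses are assumed to come from a plain name-to-address map filled in after layout.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical reference kinds for expression atoms (not lld's real kinds).
enum class ExprKind { PushSym, PushConst, Add, Multiply };

// Hypothetical stand-in for a Reference: a kind, a target symbol, an addend.
struct ExprRef {
  ExprKind kind;
  std::string target; // symbol name, meaningful for PushSym only
  int64_t addend;     // constant value, meaningful for PushConst only
};

// Evaluate the references in order (postfix) using a value stack.
// addrs is assumed to map each symbol name to its final address.
uint64_t evaluate(const std::vector<ExprRef> &refs,
                  const std::map<std::string, uint64_t> &addrs) {
  std::vector<uint64_t> stack;
  for (const ExprRef &r : refs) {
    switch (r.kind) {
    case ExprKind::PushSym: {
      auto it = addrs.find(r.target);
      if (it == addrs.end())
        throw std::runtime_error("undefined symbol: " + r.target);
      stack.push_back(it->second);
      break;
    }
    case ExprKind::PushConst:
      stack.push_back(static_cast<uint64_t>(r.addend));
      break;
    case ExprKind::Add:
    case ExprKind::Multiply: {
      if (stack.size() < 2)
        throw std::runtime_error("malformed expression");
      uint64_t rhs = stack.back();
      stack.pop_back();
      uint64_t lhs = stack.back();
      stack.pop_back();
      stack.push_back(r.kind == ExprKind::Add ? lhs + rhs : lhs * rhs);
      break;
    }
    }
  }
  // A well-formed expression leaves exactly one value on the stack.
  if (stack.size() != 1)
    throw std::runtime_error("malformed expression");
  return stack.back();
}

// The Reference sequence for "A + B * 2", in the order listed above.
std::vector<ExprRef> exampleRefs() {
  return {
      {ExprKind::PushSym, "B", 0},
      {ExprKind::PushConst, "", 2},
      {ExprKind::Multiply, "", 0},
      {ExprKind::PushSym, "A", 0},
      {ExprKind::Add, "", 0},
  };
}
```

With A at 0x1000 and B at 0x20, evaluate(exampleRefs(), addrs) gives 0x1040.  The PushSym entries are also the only ones that name a target, so the resolver could treat them like any other reference to keep A and B alive, while the remaining kinds carry no target at all.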


> h) I still need to figure out, what are the ways this symbol can be overridden, if the same symbol is defined in a file, does it override, (Resolver may need to handle it).
Oh my, weak aliases and expression symbols combined!

-Nick




