[LLVMdev] MachO and ELF Writers/MachineCodeEmitters are hard-coded into LLVMTargetMachine

Aaron Gray aaronngray.lists at googlemail.com
Mon Mar 16 06:50:35 PDT 2009


>> Sorry, I disagree. Actually the MachineCodeEmitter or the
>> 'MachineCodeWriter' does not do any file handling at all. Do look at the
>> code for the MachineCodeWriter and you will see it only writes to memory,
>> and if it reaches the end of the allotted memory I believe higher-level
>> logic reallocates a larger buffer and starts again from scratch. This
>> could be avoided if they generated fixups for absolute memory references
>> referring within the outputted code. Then an alloc function could be
>> called before outputting, say, a 4-byte int and could realloc and copy
>> the code, and when finally written the fixups could be applied.
>
> IIRC the memory allocation is done in the MachineCodeEmitter, not the
> higher level (see startFunction and finishFunction). The current
> implementation has startFunction allocate some (arbitrary) reserve
> size in the output vector, and if the emitter runs out of space,
> finishFunction returns a failure, causing the whole process to occur
> again. This is icky.

Going from the doxygen documentation, which I doubt has changed,
MachineCodeEmitter::(start/finish)Function are both abstract functions; it
is the hidden class JITEmitter that implements these. MachineCodeEmitter is
an abstract class but does provide start, end and current pointers.
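
To make that split concrete, here is a minimal sketch of the arrangement. It
is not the LLVM source: SketchEmitter, FixedBufferEmitter and emitWithRetry
are made-up stand-ins, and the signatures are simplified assumptions. It only
shows an abstract base that holds the start, end and current pointers, a
concrete subclass that supplies start/finish, and the "run the whole thing
again if the buffer was too small" behaviour described in the quoted text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an abstract emitter; NOT the real LLVM class.
class SketchEmitter {
protected:
  uint8_t *BufferBegin = nullptr;  // start of the output buffer
  uint8_t *BufferEnd = nullptr;    // one past the last usable byte
  uint8_t *CurBufferPtr = nullptr; // next byte to be written
  bool Overflowed = false;         // set when a write did not fit

public:
  virtual ~SketchEmitter() = default;
  virtual void startFunction() = 0;
  virtual bool finishFunction() = 0; // true means "out of space, retry"

  // Writes one byte if there is room, otherwise records the overflow.
  void emitByte(uint8_t B) {
    if (CurBufferPtr != BufferEnd)
      *CurBufferPtr++ = B;
    else
      Overflowed = true;
  }
};

// Concrete emitter with a fixed reserve per attempt (assumed policy:
// double the reserve after every failed attempt).
class FixedBufferEmitter : public SketchEmitter {
  std::vector<uint8_t> Storage;
  std::size_t Reserve;

public:
  explicit FixedBufferEmitter(std::size_t InitialReserve)
      : Reserve(InitialReserve) {
    assert(InitialReserve > 0 && "reserve must be non-zero");
  }

  void startFunction() override {
    Storage.assign(Reserve, 0);
    BufferBegin = CurBufferPtr = Storage.data();
    BufferEnd = BufferBegin + Storage.size();
    Overflowed = false;
  }

  bool finishFunction() override {
    if (Overflowed) {
      Reserve *= 2; // everything emitted so far is thrown away
      return true;
    }
    Storage.resize(static_cast<std::size_t>(CurBufferPtr - BufferBegin));
    return false;
  }

  const std::vector<uint8_t> &bytes() const { return Storage; }
};

// Emits Count bytes, repeating the whole emission until it fits.
// Returns the number of attempts that were needed.
int emitWithRetry(FixedBufferEmitter &E, std::size_t Count) {
  int Attempts = 0;
  bool Retry = false;
  do {
    ++Attempts;
    E.startFunction();
    for (std::size_t I = 0; I < Count; ++I)
      E.emitByte(static_cast<uint8_t>(I));
    Retry = E.finishFunction();
  } while (Retry);
  return Attempts;
}
```

With a reserve of 4 bytes and 10 bytes to emit, the loop throws away two
complete attempts before the third one fits, which is the waste being
complained about.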

> It would be far better if the underlying buffer would grow
> automatically (with an allocation method in the base class, as you
> suggested), as code is emitted to it.

Yes. An alloc(4), for example, would make sure there are another 4 bytes to
be written to; if not, it would copy the whole buffer and allocate, say, 4/3
more memory. The only problem is non-PIC (Position Independent Code): this
would require storing fixups, which could probably be done via the
relocation mechanism.
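
A rough sketch of that idea, under stated assumptions: GrowableCodeBuffer,
alloc, emitAbsoluteRef and applyFixups are hypothetical names, not existing
LLVM API, and a 32-bit little-endian target is assumed. Absolute references
into the buffer are recorded as offsets and only patched once the final base
address is known, so growing (and moving) the buffer in between is harmless.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical growable code buffer; names and layout are assumptions.
class GrowableCodeBuffer {
  // One pending absolute reference: where to patch, and which buffer
  // offset the patched address should point at.
  struct Fixup {
    std::size_t Where;
    uint32_t TargetOffset;
  };

  std::vector<uint8_t> Bytes; // backing store, may be larger than Used
  std::size_t Used = 0;       // number of bytes actually emitted
  std::vector<Fixup> Fixups;

public:
  // Makes sure N more bytes can be written, growing by about a third
  // (or to the exact size needed, whichever is larger).
  void alloc(std::size_t N) {
    if (Used + N <= Bytes.size())
      return;
    std::size_t NewSize = Bytes.size() + Bytes.size() / 3;
    if (NewSize < Used + N)
      NewSize = Used + N;
    Bytes.resize(NewSize); // std::vector copies the old contents for us
  }

  void emitByte(uint8_t B) {
    alloc(1);
    Bytes[Used++] = B;
  }

  // Emits a 32-bit value in little-endian byte order.
  void emitWord32(uint32_t W) {
    alloc(4);
    for (int I = 0; I < 4; ++I)
      Bytes[Used++] = static_cast<uint8_t>(W >> (8 * I));
  }

  // Emits a zero placeholder for an absolute address inside this buffer
  // and remembers that it has to be patched later.
  void emitAbsoluteRef(uint32_t TargetOffset) {
    Fixups.push_back({Used, TargetOffset});
    emitWord32(0);
  }

  // Patches every recorded reference once the load address is known.
  void applyFixups(uint32_t BaseAddress) {
    for (const Fixup &F : Fixups) {
      assert(F.Where + 4 <= Used && "fixup outside emitted code");
      uint32_t Addr = BaseAddress + F.TargetOffset;
      for (int I = 0; I < 4; ++I)
        Bytes[F.Where + I] = static_cast<uint8_t>(Addr >> (8 * I));
    }
  }

  std::size_t size() const { return Used; }

  uint8_t at(std::size_t I) const {
    assert(I < Used && "index past emitted code");
    return Bytes[I];
  }
};
```

Position-independent code never needs emitAbsoluteRef, so for PIC the fixup
list simply stays empty.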

I want to use a straight 'std::vector<byte>', or a reference to one, for the
ObjectCodeEmitter.

Anyway, I think we should bear this in mind but leave this code alone for
now and come back to it once we have ObjectCodeWriters in place. (This is
political.)

>> 'ObjectCodeEmitter' looks like the right description to parallel the
>> MachineCodeEmitter. Its emitting object code to a data stream (which
>> is an object file section) and not direct to a file.
>
> I can live with that. Before you implement anything, can we try and
> define the responsibilities of the various classes?

This is pretty clear cut: read and reread the code. But adding some more
documentation would help.

> We have MachineCodeEmitter, which is responsible for actually emitting
> bytes into a buffer for a function.

Yep.

> Should it have methods for
> emitting instructions/operands, or should it only work at the byte,
> dword, etc. level?

No, this is done by X86CodeEmitter and the other ***CodeEmitters. They are in
anonymous namespaces, but look in the 'lib/Target/*' directories, specifically
'lib/Target/X*CodeEmitter', and look at X86CodeEmitter.

> ObjectCodeEmitter is responsible for emission of object 'files' into
> a memory buffer. This includes handling of all object headers,
> management of sections/segments, symbol and string tables and
> relocations. The ObjectCodeEmitter should delegate all actual 'data
> emission' to the MachineCodeEmitter.

No, look at ELFWriter and ELFCodeEmitter.

> ObjectCodeEmitter is a MachineFunctionPass. It does 'object wide'

No, ELFWriter inherits from MachineFunctionPass.
And ELFCodeEmitter from MachineCodeEmitter.

> setup in doInitialization and finalizes the object in doFinalize(?).
> Each MachineFunction is emitted through the runOnFunction method,
> which passes the MachineFunction to the MachineCodeEmitter. The

ELFWriter::runOnFunction does nothing.

> MachineCodeEmitter calls back to the ObjectCodeEmitter in order to
> look up sections/segments, add globals to an unresolved globals list
> etc.

No.

> I'm not too happy about the broken encapsulation here. I'd prefer to
> find a better way to model this.

Please, reread the code from SVN, make diagrams, preferably UML, and look at
what is really happening!

Aaron



