[LLVMdev] A number of newbie questions

Mon Jan 9 11:49:51 PST 2006

On Mon, 9 Jan 2006, Marcel Weiher wrote:
> I am currently experimenting with LLVM to provide native code 
> compilation services for a project of mine I call Objective-Smalltalk, 
> and so far quite pleased with the results.  I was able to JIT-compile 
> some functions that send Objective-C messages, and now look forward to 
> compiling full methods.

Cool!

> I do have a couple of questions that I haven't been able to answer after 
> looking through what I think is the available documentation:
>
> 1.	Executable size
>
> Executables appear to be gargantuan, a framework that wraps the parts 
> required for the above functionality weighs in at 13 MB fully stripped ( 
> -x ) and at 72 MB (!) with debugging symbols.  Is there any way of 
> significantly reducing this size, at present or planned in the future?

It depends on what you're building.  A release build of LLVM (make 
ENABLE_OPTIMIZED=1, with the results in llvm/Release) is significantly 
smaller than a debug build.  Even with that, however, the binaries are 
larger than they should be (5M?).  To my knowledge, no one has spent the 
time to track down why.

> 2.	Global (Function) naming
>
> It appears that I have to give 'functions' a global/module visible name 
> in order to create them, which is a bit odd for the case of compiling 
> methods, as their "name" is really more a function of where they get 
> stuffed in the method table of the class in question, something I might 
> not even know at the time I am compiling the method.  Also these names 
> seem to actually exist in the global function/symbol namespace of the 
> running program, or at least interact with it.

You can use "" for the name.  Multiple functions are allowed to have "" as 
a name without problem.
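
To illustrate the idea without tying it to any particular LLVM release, 
here is a minimal, LLVM-free sketch in plain C++.  Everything in it 
(CompiledMethod, MethodTable, makeExampleTable, and the selector strings) 
is hypothetical and made up for this example; none of it is LLVM API.  It 
only shows that compiled code can be reached purely through a per-class 
method table, so the functions themselves never need a global name:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the pointer a JIT hands back for a compiled
// function; a real method would also take a receiver and a selector.
using CompiledMethod = int (*)(int);

// Hypothetical per-class method table, keyed by selector.  The selector is
// the only "name" a method has here; the compiled code itself is anonymous.
class MethodTable {
public:
    void install(const std::string &selector, CompiledMethod code) {
        table_[selector] = code;
    }

    // Returns nullptr when the class does not implement the selector.
    CompiledMethod lookup(const std::string &selector) const {
        auto it = table_.find(selector);
        return it == table_.end() ? nullptr : it->second;
    }

private:
    std::unordered_map<std::string, CompiledMethod> table_;
};

// Builds a table from two unnamed functions.  Captureless lambdas stand in
// for functions created with "" as their name: neither has a symbol that
// could collide with anything in a global namespace.
inline MethodTable makeExampleTable() {
    MethodTable table;
    table.install("doubled", [](int x) { return 2 * x; });
    table.install("negated", [](int x) { return -x; });
    return table;
}
```

In this sketch, a caller would do table.lookup("doubled") and invoke the 
returned pointer after checking it against nullptr.  In the real setup the 
pointer would presumably come from the JIT rather than from a lambda.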

> I currently just synthesize a dummy name from the address of the object 
> in question, but that's really a bit of a hack.  Is there some way of 
> interacting with LLVM without having to interact with this global 
> namespace?

Yup :)

> 3.	Modules / JITs / functions
>
> As far as I can tell, I need a 'Module' in order to create a function, 
> at least that's the only way I've been able to make it work so far, but 
> I am not really clear why this should be the case.

Yes, Function objects must be embedded into Module objects for the LLVM 
code to be well formed.

> Of course, I also 
> need this Module to create the JIT (or do I?).

Yes, the JIT does need a module to know where to get code to compile from.

> I've now made the Module 
> (or rather my wrapper) a singleton, effectively a global, but I don't 
> feel very comfortable about it.

This should work.  Think of it as just a container for the LLVM code you 
are creating.

> I also remember some issues with not being able to create a second 
> JIT later, so it seems like one module per lifetime of a process that 
> wants to do jitting.

I'm not sure what you mean here.

> 4.	Jitted functions / ownership / memory
>
> Once a function is jitted I can get a function pointer to it and call 
> it, that's great.  Can I also find out how long it is, for example if I 
> wanted to write an object file?
> All in all, the jit-result seems to be 
> fairly opaque and hidden.  Is this intentional, or is there more I am 
> missing?

There are ways, but there isn't an elegant public interface for this yet.
For a couple of reasons, it is tricky to JIT code to memory, then wrap it 
up into an object file (in particular, the JIT'd code is already 
relocated).  The start of a direct ELF writer is available in 
lib/CodeGen/ELFWriter.cpp, but it is not complete yet.  It uses the same 
codegen interfaces as the JIT to do the writing.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/