[LLVMdev] Some questions on LLVM design
Marc Ordinas i Llopis
lists at tragnarion.com
Fri Oct 22 06:18:00 PDT 2004
Hi everybody,
I'm currently looking at LLVM as a possible back-end for a dynamic
programming system (in the tradition of Smalltalk) we are developing. I
have read most of the llvmdev archives, and I'm aware that some things
are 'planned' but not implemented yet. We are willing to contribute the
code we'll need for our project, but before I can start coding I'll have
to submit to my boss a very concrete proposal of which changes I'll make
and how long they will take.
So before presenting that proposal, I have some questions about the
design of LLVM and about how certain constructs should be mapped onto
its bytecode representation. Please understand that these questions are
not meant as criticism of LLVM, but as a way to improve my understanding
of it.
1. Opcodes and intrinsics
What are the differences between opcodes and intrinsics? How is it
decided whether an operation should be implemented as an opcode or as an
intrinsic function?
As I understand it, compilation passes can both lower intrinsics into
opcodes and replace opcode sequences, so in the end some of them are
interchangeable. For example, why is there a store opcode and an
llvm_gcwrite intrinsic? Couldn't the front-end just produce
stores/volatile stores and have a compilation pass transform them into
write barriers where necessary?
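To make the write-barrier case concrete, here is a rough C sketch of
what I imagine such a lowering could amount to at run time. The
remember_slot() hook and the function names are things I made up for
the example, not LLVM's actual GC interface:

  #include <stdio.h>

  /* Hypothetical runtime hook: record a mutated slot in a remembered set. */
  static void remember_slot(void **slot) {
      printf("remembered slot %p\n", (void *)slot);
  }

  /* What the front-end would emit if it just used plain stores... */
  static void plain_store(void **slot, void *value) {
      *slot = value;
  }

  /* ...and what a write-barrier lowering pass could turn that store into. */
  static void barriered_store(void **slot, void *value) {
      remember_slot(slot);   /* notify the collector before mutating */
      *slot = value;
  }

  int main(void) {
      void *field = NULL;
      int obj = 42;
      plain_store(&field, &obj);
      barriered_store(&field, &obj);
      return 0;
  }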
One possible view of intrinsics would be "operations that don't depend
on the target architecture, but on the language runtime". But then,
wouldn't malloc/free be intrinsics?
2. Stack and registers
As the LLVM instruction set has a potentially unlimited number of
registers, which are mapped onto target registers or stack slots by the
register allocator, why is there a separate stack? I would understand
it, if the stack were more accessible, as a way to implement closures,
but it has been repeated here that the correct way to do that is to use
heap-allocated structures, since functions can't access other functions'
stacks. Is it there to mark locations that need to be changed in place?
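For reference, this is the kind of heap-allocated closure I have in
mind, written in plain C with names invented for the example; I assume
it is roughly what people mean when they say captured variables should
not live on the stack:

  #include <stdio.h>
  #include <stdlib.h>

  /* Captured variables live on the heap, because the creating function's
     stack frame is gone (and was never visible to other functions) by the
     time the closure is invoked. */
  struct adder_env {
      int captured_x;
  };

  static int adder_call(struct adder_env *env, int y) {
      return env->captured_x + y;
  }

  /* "Closure" = code pointer + pointer to heap-allocated environment. */
  struct adder_closure {
      int (*fn)(struct adder_env *, int);
      struct adder_env *env;
  };

  static struct adder_closure make_adder(int x) {
      struct adder_env *env = malloc(sizeof *env);
      env->captured_x = x;
      struct adder_closure c = { adder_call, env };
      return c;
  }

  int main(void) {
      struct adder_closure add5 = make_adder(5);
      printf("%d\n", add5.fn(add5.env, 3));   /* prints 8 */
      free(add5.env);
      return 0;
  }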
3. Control transfer
Why are the control transfer operations so high level compared to what
actual processors provide? Processors usually just have instructions to
jump to a concrete location, and everything else (saving registers to
the stack, fetching results, etc.) is managed by the compiler according
to the language's calling convention. In LLVM there's essentially just
one way to transfer control to a function, and the only proposal I've
seen
(http://nondot.org/sabre/LLVMNotes/CustomCallingConventions.txt) keeps
things at this high level. What are the difficulties in having low-level
control transfer operations, with explicitly managed arguments, register
saving, and so on?
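To clarify what I mean by "explicitly managed", here is a toy model of
it in C (all names invented): the caller fills in a pseudo register
file and the callee reads its arguments from it, whereas in LLVM all of
this bookkeeping is hidden behind a single call:

  #include <stdio.h>

  /* Toy "register file"; on a real machine these would be r0, r1, lr, ... */
  struct regfile {
      long r0, r1;     /* argument / return-value registers */
      long saved_lr;   /* return address the caller wants preserved */
  };

  /* The callee reads its arguments from the register file and writes its
     result back into r0, as a low-level calling convention would. */
  static void callee(struct regfile *rf) {
      rf->r0 = rf->r0 + rf->r1;
  }

  int main(void) {
      struct regfile rf = { .r0 = 2, .r1 = 3, .saved_lr = 0 };
      callee(&rf);                /* in LLVM this is a single 'call' */
      printf("%ld\n", rf->r0);    /* prints 5 */
      return 0;
  }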
Well, that's all for now. Thanks in advance,
Marc Ordinas i Llopis | Tragnarion Studios