[LLVMdev] libc dependencies, code generation questions

Thu Jun 7 01:36:57 PDT 2007

Hello,

I'm looking into creating an llvm backend for the Free Pascal  
Compiler (<http://www.freepascal.org>). After reading a bit through  
the documentation and looking at some code generated by llvm-gcc, I  
have a couple of questions:

1) is there a way to specify ranges in the switch statement? Pascal  
supports switch statements (called "case" statements there) which  
look like this:

case <expr> of
   1..1000000: dothis;
   1000001..1000000000: dothat;
end;

Generating a switch statement with 10^9 individual entries is not  
really feasible in practice. We can of course map all "large" ranges  
in case statements into equivalent if-statements, but that largely  
defeats the elegance and ease of use of the switch statement for us :)
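
To make the fallback concrete, here is a rough C sketch of the kind of
comparison-based lowering we would have to emit ourselves if ranges
cannot be expressed directly. The names (CaseRange, select_arm,
example_ranges) are made up for this mail, and the sketch assumes the
ranges are sorted and do not overlap:

```c
#include <stddef.h>

/* Hypothetical description of one case arm: an inclusive range of
   selector values plus the index of the branch to execute. */
typedef struct {
    long lo;
    long hi;
    int  arm;
} CaseRange;

/* Binary search over ranges that are sorted by 'lo' and do not overlap.
   Returns the arm index, or -1 when no range matches (the "else" branch). */
static int select_arm(const CaseRange *ranges, size_t count, long value)
{
    size_t left = 0, right = count;
    while (left < right) {
        size_t mid = left + (right - left) / 2;
        if (value < ranges[mid].lo)
            right = mid;            /* continue in the lower half */
        else if (value > ranges[mid].hi)
            left = mid + 1;         /* continue in the upper half */
        else
            return ranges[mid].arm; /* lo <= value <= hi */
    }
    return -1;
}

/* The two ranges from the Pascal example above. */
static const CaseRange example_ranges[] = {
    { 1L,       1000000L,    0 },  /* dothis */
    { 1000001L, 1000000000L, 1 },  /* dothat */
};
```

This needs two comparisons per probed range instead of one switch
entry per value, which is what we would like to avoid generating by
hand.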

2) I assume llvm sometimes adds implicit calls to functions in the C  
library, e.g. for llvm.malloc, llvm.free, some floating point  
routines and some others. Is there a policy regarding which llvm  
opcodes may result in C library dependencies and which not? The  
reason I ask is that we try to depend only on stable system
interfaces (i.e., the interfaces that are least likely to break
backwards binary compatibility), and on a number of OSes (such
as Linux) this means using system calls rather than libc.

We have our own alternate implementations of all the functionality  
expressed by the "high level" llvm opcodes, but I don't know if there  
is a mechanism available to redirect these from their (presumed)  
standard libc dependencies to our own routines.

3) we support inline assembler in the same way that Turbo Pascal and  
Delphi did: you just type in code without telling the compiler what  
registers or memory locations this routine clobbers, and the compiler  
thus cannot make any assumptions about them (other than what the
ABI/calling convention specifies). As far as llvm is concerned, they
should be semantically equivalent to calling an external routine
which was not compiled to llvm ir. Is there a generic way to tell
this to llvm, or should one simply specify all volatile registers as
read and clobbered, and the same for memory?
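
To illustrate the second option: in GCC-style extended asm (the syntax
used by llvm-gcc's input), I would expect it to look roughly like the
C sketch below. The names shared_counter and bump_around_asm are made
up, the empty asm body merely stands in for the user's assembler code,
and only the target-independent "memory" and "cc" clobbers are shown;
the call-clobbered registers would have to be listed per target.

```c
/* Hypothetical global that the opaque assembler block might touch. */
int shared_counter = 0;

int bump_around_asm(void)
{
    shared_counter = 41;

    /* Empty body standing in for user-written assembler.
       "memory": the block may read or write memory, so the compiler
                 should not keep memory values cached across it.
       "cc":     the block may change the condition flags.
       Target-specific volatile registers would be appended here. */
    __asm__ __volatile__("" : : : "memory", "cc");

    /* The value is read again after the asm block. */
    return shared_counter + 1;
}
```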

4) to what extent is the front end (i.e., our compiler) responsible  
for code selection and optimization? In other words, should we spend  
a lot of time on converting if-statements to select-based predicates  
and things like this, or will this be done by llvm afterwards anyway?  
What about vectorization? Are there particular kinds of optimizations  
which llvm will probably never be very good at (or which are not  
llvm's focus in the near to middle term), and which thus should  
definitely be done at a higher level?

Thanks,

Jonas