[cfe-dev] Clang C asm and LLVM

Sun Jan 24 18:10:44 PST 2010

Hi Patrick,

On 25 Jan 2010, at 01:03, Patrick Moran wrote:

> Hello everyone,
> 
>     I was wondering if there was any way to cause LLVM to interpret
> the asm or __asm__ inline assembly syntax as containing LLVM IR.  That
> is, if I wanted to inject LLVM IR directly into a function, is there
> any way to do so?

No.  It would be a nice feature, which a few people have suggested, but implementing it is not quite trivial.  Clang would need to be able to parse the IR, which LLVM can already do, but it would then need to effectively do the same thing as the bitcode linker and the inliner.  Some of this could be done by calling the LLVM code, but a lot would need to be written for Clang.

>     In case the context is relevant, I'm writing a compiler for a
> higher-level language and I'm debating between generating C (simpler)
> and generating LLVM IR directly.  Generating C would be preferable,
> but since I will need to emulate try/catch, I need to be able to
> instruct the compiler to use the invoke instruction from the IR
> instead of the call function.  I figured I could simplify this in C by
> wrapping a varargs macro that generates an __asm__ actually performing
> the invoke.
> 
>     Any ideas?

A few things.  First, it seems that you think LLVM's invoke is a bit more magic than it is.  All it does is define an exception handling region.  The LLVM exception-handling selector intrinsic then fills in the rest, but it is the language's personality function that does the real work.  LLVM just writes DWARF data for each function.  The personality function needs to read this data, and it then uses the unwind library to set the program counter.

You can find a horribly complicated personality function in libstdc++, which follows the complicated rules for C++ exceptions.  I've written one for LanguageKit, which is used for non-local returns and is about as simple as it's possible to be (possibly slightly simpler than it should be).  Writing these is absolutely no fun.  The stuff you need to know is scattered across the LLVM documentation, the Itanium ABI specification and the GCC source code.  All of these are slightly contradictory in fun and exciting ways, and you get to find out which one is right by watching your stack get corrupted in the debugger.

It's worth noting that throwing this kind of exception is very expensive.  The 'zero-cost' name is a bit misleading.  It means that exceptions cost nothing unless they are used, but when one is thrown you have a call into the unwind library, which calls the personality function for every frame on the stack until it finds the handler.  The personality function generally makes several other function calls to find out whether this is the right frame, so exceptions can quickly become the bottleneck if you are throwing them often.  If you're implementing a language which encourages throwing exceptions, like Ruby or Java, then it's probably better to define your ABI to return two values for each call, one containing the real result and one containing an exception, then test the exception after each call.  This incurs a small overhead for each call, but no additional overhead when you actually do throw an exception (other than, perhaps, a mispredicted branch).
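
As a rough sketch of the two-value ABI described above, generated code might look something like the following.  It is written in C++ for concreteness, and every name in it (Result, Exception, checked_divide, average_ratio and so on) is made up for illustration rather than taken from any real runtime.  A real ABI would also need to decide how the pair is passed in registers, which this does not address.

```cpp
#include <cassert>

// Hypothetical exception object for the source language.
struct Exception {
    const char *message;
};

// Every generated function returns this pair: the real result, plus an
// exception slot that is null when the call completed normally.
struct Result {
    int value;
    const Exception *exception;
};

const Exception kDivideByZero = {"divide by zero"};

// Reads the result; only valid once the caller has checked the exception slot.
int unwrap(const Result &r) {
    assert(r.exception == nullptr);
    return r.value;
}

// A generated function that can "throw": failure goes in the second field.
Result checked_divide(int a, int b) {
    if (b == 0) {
        return {0, &kDivideByZero};
    }
    return {a / b, nullptr};
}

// A generated caller with no handler of its own: it tests the exception
// slot after each call and propagates it to its own caller.
Result average_ratio(int a, int b, int c) {
    Result first = checked_divide(a, c);
    if (first.exception) return first;
    Result second = checked_divide(b, c);
    if (second.exception) return second;
    return {(unwrap(first) + unwrap(second)) / 2, nullptr};
}

// The equivalent of a catch block: handle the exception instead of propagating it.
int ratio_or_default(int a, int b, int c, int fallback) {
    Result r = average_ratio(a, b, c);
    if (r.exception) {
        return fallback;
    }
    return unwrap(r);
}
```

The cost on the normal path is one compare and branch per call, and the throwing path is just an ordinary return.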

If you want to use exceptions and don't want to write a personality function, then you might consider emitting C++ or Objective-C code, instead of C.  If you emit C++, then you can compile your generated code with one of several different compilers without having to do anything special for exceptions.  
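
To make that concrete, here is a minimal sketch of what emitted C++ for a source-level throw and try/catch might look like.  The names LangException, parse_digit and parse_digit_or are hypothetical; the point is only that the C++ compiler and its runtime supply the unwind tables and the personality function, so the generated code never mentions them.

```cpp
#include <string>

// Hypothetical exception type for the source language.
struct LangException {
    std::string message;
};

// Generated from a source-level "throw".
int parse_digit(char c) {
    if (c < '0' || c > '9') {
        throw LangException{std::string("not a digit: ") + c};
    }
    return c - '0';
}

// Generated from a source-level try/catch.  The handler is an ordinary
// C++ catch clause, so no hand-written personality function is needed.
int parse_digit_or(char c, int fallback) {
    try {
        return parse_digit(c);
    } catch (const LangException &) {
        return fallback;
    }
}
```

The trade-off is the one described earlier: each throw goes through the unwinder, so this suits languages where exceptions are rare.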

If you want to use the JIT features of LLVM, then you'd be better off emitting IR directly than going via C.  I'd suggest that you look at the IRBuilder class, which makes it very easy to create LLVM instructions.  Between the Module, Function, and IRBuilder classes, you have 90% of what you need to know to write a simple front end for LLVM.  

David

-- Sent from my Cray X1