[LLVMdev] Some additions to the C bindings

Thu Oct 8 05:20:57 PDT 2009

On Thu, Oct 8, 2009 at 2:39 AM, Erick Tryzelaar
<idadesub at users.sourceforge.net> wrote:
> On Tue, Oct 6, 2009 at 5:47 PM, Kenneth Uildriks <kennethuil at gmail.com> wrote:
>> On Tue, Oct 6, 2009 at 2:13 PM, Kenneth Uildriks <kennethuil at gmail.com> wrote:
>>
>> LLVMGetAttribute had a bug in it.  Here's the revised version of the patch
>
> Hi Kenneth!
>
> I wouldn't say that I'm the best reviewer, but I've been doing some
> work with the c bindings recently so hopefully I have some idea of
> what I'm talking about :) Comments are inlined:

Thanks.  Let me start by talking a bit about my project.

I'm working on a compiler/language that supports run-time code
generation and compile-time code execution.  Besides the obvious
benefits of easier JITting, I also get the benefits of C++ templates
and metaprogramming without all of the headaches.

To make this work, the compiler actually compiles functions down into
function generators, outputting calls to the LLVM C-bindings that
generate a "regular" function.  The programmer can then either leave
them in that form for run-time JITting, or have the compiler JIT and
execute those function generators in order to get "regular" functions.
Either or both can be exposed as public functions and left in place
by the optimizer.  The function generator gets its own set of
parameters, and multiple functions with variations can be generated at
compile time or runtime.

The programmer can also put compile-time expressions inside the body
of functions, so that when the function generator runs, the
compile-time expressions are evaluated and used for function
generation.  Those compile-time expressions can use global variables
and/or the function generator parameters.

Anyway, this scheme means that extensive LLVM capability needs to be
available to generated code, since it's the generated code that
creates all of the "regular" functions.  Generated code has a much
easier time calling the C bindings than the C++ API.

>
>
> +/** See the llvm::Use class. */
> +typedef struct LLVMOpaqueUse *LLVMUseRef;
> +
> ...
> +void LLVMReplaceAllUsesWith(LLVMValueRef OldVal, LLVMValueRef NewVal);
> ...
> +/* Operations on Uses */
> +LLVMUseRef LLVMGetFirstUse(LLVMValueRef Val);
> +LLVMUseRef LLVMGetNextUse(LLVMUseRef U);
> +LLVMValueRef LLVMGetUser(LLVMUseRef U);
> +LLVMValueRef LLVMGetUsedValue(LLVMUseRef U);
>
>
> These seem okay to me, but I don't have too much experience with using
> the Use classes. The impression I've gotten from the other developers
> is that the C bindings are really designed to just get data into llvm,
> and any complex manipulations should really be done in C++ passes.
> What's your use case for exposing these more complex manipulations?

I'm using it to support renaming functions while still allowing
generated code to look up those functions by name; basically searching
for all global strings containing the function name, and replacing all
uses of them with uses of the new function name.

I would like to do away with that, but I haven't quite managed to get
rid of all cases where LLVMGetNamedFunction is called by generated
code.
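
To make the shape of that use-walking and replacement concrete, here is
a rough sketch.  It is a self-contained toy model, not the LLVM API:
the Value and Use structs and the helpers add_use, count_uses and
replace_all_uses_with are hypothetical stand-ins that only mirror the
first/next iteration style of LLVMGetFirstUse/LLVMGetNextUse and the
effect of LLVMReplaceAllUsesWith.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for llvm::Value and llvm::Use; NOT the LLVM types. */
typedef struct Use Use;

typedef struct Value {
    const char *name;
    Use *first_use;   /* head of this value's intrusive use list */
} Value;

struct Use {
    Value *used;      /* the value being referenced */
    Value *user;      /* the value that holds the reference */
    Use *next;        /* next use of the same `used` value */
};

/* Record that `user` references `used`, linking `u` into the use list. */
static void add_use(Use *u, Value *used, Value *user) {
    assert(u != NULL && used != NULL && user != NULL);
    u->used = used;
    u->user = user;
    u->next = used->first_use;
    used->first_use = u;
}

/* Walk the list with a first/next loop and count the entries. */
static unsigned count_uses(const Value *v) {
    unsigned n = 0;
    for (const Use *u = v->first_use; u != NULL; u = u->next)
        n++;
    return n;
}

/* Re-point every use of old_val at new_val, leaving old_val unused. */
static void replace_all_uses_with(Value *old_val, Value *new_val) {
    if (old_val == new_val)
        return;
    while (old_val->first_use != NULL) {
        Use *u = old_val->first_use;
        old_val->first_use = u->next;   /* unlink from the old list */
        u->used = new_val;
        u->next = new_val->first_use;   /* push onto the new list */
        new_val->first_use = u;
    }
}
```

In this model the old name string would be old_val and the new name
string new_val; after the call, every user that referred to the old
string refers to the new one, and the user side of each Use is
untouched.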

Also, I've gotten the impression from other developers that the
C-bindings are considered incomplete and that there is a general
desire to expose more functionality, and eventually all LLVM
functionality, through them.

>
>
> +/* Operations on Users */
> +LLVMValueRef LLVMGetOperand(LLVMValueRef Val, unsigned Index);
>
>
> So how are you using this, since you aren't exposing any of the other
> operand functionality?

This supports the "address-of" operator.  Any Value that is a LoadInst
can have its address taken.  I need the pointer operand of the
LoadInst to get the address Value.

I figured GetOperand was a good starting point, and could support most
of the operand use cases out there.
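
As a rough illustration of the address-of idea, here is a toy model in
plain C.  Nothing in it is the LLVM API: the Node struct, the Opcode
enum and the get_operand and address_of helpers are hypothetical
names.  The only assumption carried over from LLVM is that a load's
pointer is its operand 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for the toy model. */
typedef enum { OP_ALLOCA, OP_LOAD, OP_ADD } Opcode;

/* Toy stand-in for a value that has operands (an llvm::User). */
typedef struct Node {
    Opcode opcode;
    struct Node *operands[2];
    unsigned num_operands;
} Node;

/* Same role as the proposed LLVMGetOperand(Val, Index). */
static Node *get_operand(const Node *n, unsigned index) {
    assert(n != NULL && index < n->num_operands);
    return n->operands[index];
}

/* The address of a loaded value is the pointer the load read from.
   Returns NULL when the expression is not a load, so it has no
   address to take. */
static Node *address_of(const Node *expr) {
    if (expr == NULL || expr->opcode != OP_LOAD)
        return NULL;
    return get_operand(expr, 0);
}
```

A front end built this way would reject address-of on anything for
which address_of returns NULL, such as the result of an add.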

>
>
> +unsigned long long LLVMConstIntGetZExtValue(LLVMValueRef ConstantVal);
> +long long LLVMConstIntGetSExtValue(LLVMValueRef ConstantVal);
>
>
> I'm not sure about these functions. There really isn't any other way
> to get to the value of any other constant, so why do you need this?

When I've parsed an int literal and put it on my evaluation stack as a
Value, there's a case where I need to get it back as an int.
Specifically, the LLVMBuildExtractValue function requires an unsigned
index, not a Constant, to identify the member.  I believe that
LLVMBuildStructGEP does as well.

>
>
>  /* Operations on composite constants */
> @@ -464,6 +479,7 @@
>  LLVMValueRef LLVMConstVector(LLVMValueRef *ScalarConstantVals, unsigned Size);
>
>  /* Constant expressions */
> +unsigned LLVMGetConstOpcode(LLVMValueRef ConstantVal);
>
>
> This seems okay with me, but there really should be an LLVMInstruction
> enum defined instead of a raw unsigned value. Could you also add a
> LLVMConstExpr that wraps ConstantExpr::get?

That shouldn't be a problem.

>
>
> +int LLVMHasInitializer(LLVMValueRef GlobalVar);
>
>
> Seems fine to me. I can commit this now.
>
>
> +LLVMAttribute LLVMGetFunctionAttr(LLVMValueRef Fn);
> +LLVMAttribute LLVMGetAttribute(LLVMValueRef Arg);
>
>
> I've never really done much with attributes. What are you using this for?
>

In order to do away with include files, I'm supporting importing
modules in bitcode form.  To call a function from an imported module,
I need to put an external declaration into the compiled module, and it
really ought to have the same function and argument attributes as the
original.  And I want to be able to do that while JITting at runtime
as well.