[LLVMdev] Tool for run-time code generation?

Sun Jul 18 11:02:50 PDT 2010

Martin C. Martin wrote:
>
>
> On 7/17/2010 12:38 PM, Nick Lewycky wrote:
>> Martin C. Martin wrote:
>>>
>>>
>>> On 7/16/2010 10:30 PM, Nick Lewycky wrote:
>>>> Vlad wrote:
>>>>
>>>> Instead, break the chunks of C you would generate into functions and
>>>> compile those ahead-of-time. At run time you use llvm only (no clang)
>>>> to generate a series of function calls into those functions.
>>>
>>> Compelling. I hadn't considered that.
>>>
>>> In our application, we have a tree of primitive operations, where each
>>> one calls into its children and returns to its parent. There are various
>>> types of nodes, and we don't know the topology or types of nodes until
>>> runtime (i.e. Clang/LLVM invocation time). Each operation is pretty
>>> simple, so we'd like to inline the children's operations into the parent
>>> where a C compiler would do it.
>>>
>>> Could your technique be extended to that? e.g. precompile to LLVM IR
>>> with calls to a non-existent "node_do_foo()" call, and then replace it
>>> with the specific "childtype_do_foo()" call when we know the type of the
>>> child?
>>
>> Will you know the prototype of the function call in advance? If so, you
>> can do something really simple where you write the C functions with a
>> function pointer parameter. Then at run-time, use
>> llvm::CloneAndPruneFunctionInto to produce the specialized function by
>> giving it a valuemap that maps the Argument* for the fptr to the
>> concrete Function* you want it to call.
>
> Great! I wasn't aware of that, so that's really helpful.
>
>> If you don't know the type of the call you intend to place, the question
>> becomes "why not?" What arguments were you planning to pass it, if you
>> don't know how many arguments it takes in advance? I don't see any
>> reason to doubt that it's possible to do, but I would need more details
>> before I could suggest an implementation.
>
> We're processing large amounts of data, so imagine something like a SQL
> implementation. In one query I might want to JOIN on an int field. In
> another query, I'm JOINing on a pair of fields of type unsigned &
> double. For those two queries, I'd generate:
>
> rightChildType_seekTo(rightChild, left.getIntColumn3());
>
> vs.
>
> rightChildType_seekTo(rightChild, left.getUIntColumn9(),
> left.getDoubleColumn23());
>
> Is that possible in LLVM?

It's possible, but it's a bit more work. Breaking it down into:

   a = left.getUIntColumn9()
   b = left.getDoubleColumn23()
   rightChildType_seekTo(rightChild, a, b)

shows that you'd have to emit the calls to your getters first, then
pass their results as arguments to seekTo. This isn't that hard; it would
look something like:

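   // Builder is an IRBuilder<> positioned where the calls should go.
   Function *seekTo = ...;  // lookup based on fields
   std::vector<Value *> Args;
   Args.push_back(rightChild);
   // Pseudocode: emit one getter call per field and keep its result.
   for each field:
     Args.push_back(Builder.CreateCall(...));
   // Emit seekTo with however many arguments were collected.
   Builder.CreateCall(seekTo, Args.begin(), Args.end());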

Maybe the solution is to produce some entire functions through the LLVM 
API, where those functions can call out to the leaf functions that you 
wrote in C and compiled to IR.

If your functions are small enough then you should probably just produce 
each one with the IRBuilder. If they're large and you only want to add 
specialized code right in the middle, it may work best to write the 
function in C with a call to a special 'replace_me' function. Then, your 
custom specialization pass would clone the function and find the special 
call using a technique like this:

   http://llvm.org/docs/ProgrammersManual.html#iterate_complex

Then erase it via eraseFromParent() and insert the replacement code.
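
To make the clone / find / erase / insert sequence concrete, here is a toy
sketch that uses plain C++ containers as stand-ins for LLVM's classes. It
does not use the LLVM API at all: the Call and Body types, specialize(), and
the open_cursor/close_cursor calls are made up for illustration. In a real
pass you would walk the instructions of the cloned llvm::Function and call
eraseFromParent() on the matching call instruction instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a call instruction: callee name plus argument names.
struct Call {
  std::string callee;
  std::vector<std::string> args;
};

// Toy stand-in for a function body: a flat list of calls.
using Body = std::vector<Call>;

// Clone `tmpl`, then replace every call to `marker` in the clone with the
// calls in `replacement`. The template itself is never modified, so it can
// be specialized again for a different query.
Body specialize(const Body &tmpl, const std::string &marker,
                const Body &replacement) {
  assert(!marker.empty() && "marker name must be non-empty");
  Body clone = tmpl;  // step 1: clone the template
  std::size_t i = 0;
  while (i < clone.size()) {
    if (clone[i].callee != marker) {
      ++i;
      continue;
    }
    // step 2: erase the placeholder call
    auto pos = clone.erase(clone.begin() + static_cast<std::ptrdiff_t>(i));
    // step 3: insert the replacement code where the placeholder was
    clone.insert(pos, replacement.begin(), replacement.end());
    i += replacement.size();  // do not rescan what was just inserted
  }
  return clone;
}

// Example: specialize a template for the two-column JOIN from this thread.
Body specializeForTwoColumnJoin() {
  Body tmpl = {
      {"open_cursor", {"rightChild"}},   // hypothetical surrounding code
      {"replace_me", {}},                // the special placeholder call
      {"close_cursor", {"rightChild"}},  // hypothetical surrounding code
  };
  Body seek = {
      {"getUIntColumn9", {"left"}},
      {"getDoubleColumn23", {"left"}},
      {"rightChildType_seekTo", {"rightChild", "a", "b"}},
  };
  return specialize(tmpl, "replace_me", seek);
}
```

The point of cloning first is that the precompiled template stays intact,
so the same function can be specialized once per query shape.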

Nick