[llvm-dev] dynamic namespacing of JIT modules?

Thu Sep 13 02:12:18 PDT 2018

On Wed, 12 Sep 2018 at 21:48, Andres Freund <andres at anarazel.de> wrote:
>
> Hi,
>
> On 2018-09-12 12:09:24 +0200, Geoff Levner via llvm-dev wrote:
> > Greetings, LLVM wizards!
>
> Not one of them...
>
>
> > We have an application that uses Clang and Orc JIT to compile and
> > execute C++ code on the fly.
> >
> > The JIT contains any number of LLVM modules, each of which defines a
> > function, plus a "main" module that calls those functions. Several
> > functions may have the same signature, so I need to find a way to
> > resolve them.
> >
> > Originally, I just put each module's code in its own namespace when it
> > was compiled. But now we want to be able to compile them separately to
> > bitcode files and read them later. So at compilation time there is no
> > longer any way to assign a unique namespace to each.
>
> Why not?  If you assign a random uuid, or a sequential number or
> whatnot, that should work.

Yes, that is the solution I am looking into at the moment, actually:
using a UUID to generate a namespace when the module is compiled.
However, that means saving the UUID somewhere; the bitcode is no
longer self-sufficient. I suppose I could create a special global
variable in the module containing the UUID...
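
To make that idea concrete, here is a minimal sketch of what I have in
mind, using only the standard library. The names makeModuleId,
wrapSource and jit_module_id are made up for illustration, and the ID
is just 128 random bits printed as hex, not a real RFC 4122 UUID. It
also assumes the source text has no #include lines, since those should
not end up inside the namespace.

```cpp
#include <cstdio>
#include <random>
#include <string>

// Returns an identifier such as "m_3f9c..." (2 + 32 characters).
// The "m_" prefix keeps it a valid C++ identifier even when the
// random part starts with a digit.
std::string makeModuleId() {
    std::random_device rd;
    std::mt19937_64 gen((static_cast<unsigned long long>(rd()) << 32) | rd());
    char buf[33];
    std::snprintf(buf, sizeof buf, "%016llx%016llx",
                  static_cast<unsigned long long>(gen()),
                  static_cast<unsigned long long>(gen()));
    return std::string("m_") + buf;
}

// Wraps the module's source in namespace <id> and records the ID in a
// global with C linkage, so it can be found without knowing the ID.
// "jit_module_id" is a hypothetical name chosen for this sketch.
std::string wrapSource(const std::string &id, const std::string &source) {
    return "extern \"C\" const char jit_module_id[] = \"" + id + "\";\n"
           "namespace " + id + " {\n" + source + "\n}\n";
}
```

One catch: every module would then define the same jit_module_id
symbol, so whatever loads the bitcode would presumably have to read
the value and then rename or internalize that global before adding the
module to the JIT.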

> > 2. Assign each module a unique namespace, but don't change the modules
> > themselves: just add the namespace when a function is called from the
> > main module, and modify the JIT's symbol resolver to strip the
> > namespace and look for the function only in the relevant module.
>
> That's kind of what I do for a similar-ish problem in the JIT engine in
> postgres (which uses orcjit).  There, multiple dynamically loaded
> extensions can register functions whose source code is available, and
> each of them can have conflicting symbols.  The equivalent of your main
> module generates function names that encode information about which
> module to look in for the actual definition of the function, and then
> does the symbol resolution outside of LLVM's code.  I do that both when
> inlining these functions, and when generating function calls to the
> external function.

I did try something like that. The problem I ran into is that the
symbol resolver receives mangled function names. It is easy enough to
demangle them there, but hard to mangle names before compiling. Once
you have decoded your function name in the symbol resolver, how do you
generate a mangled name for the actual function you want to resolve
to?
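
One thing that might work in simple cases is to skip demangling and
rewrite the mangled name directly. The sketch below assumes Itanium
C++ ABI mangling with no platform symbol prefix (so Linux-style names,
not the extra leading underscore used on macOS), and handles only a
function nested in exactly one namespace. stripNamespace and Stripped
are names invented for this example. It deliberately gives up when the
parameter encoding contains an 'S', because the namespace is a
substitution candidate in that scheme and removing it would shift
references such as S0_.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Reads one Itanium <source-name> (decimal length followed by that
// many characters) starting at pos, and advances pos past it.
static std::optional<std::string> readSourceName(const std::string &s,
                                                 std::size_t &pos) {
    std::size_t len = 0, p = pos;
    while (p < s.size() && std::isdigit(static_cast<unsigned char>(s[p]))) {
        len = len * 10 + static_cast<std::size_t>(s[p] - '0');
        if (len > s.size()) return std::nullopt;  // cannot possibly fit
        ++p;
    }
    if (p == pos || len == 0 || len > s.size() - p) return std::nullopt;
    std::string name = s.substr(p, len);
    pos = p + len;
    return name;
}

struct Stripped {
    std::string ns;      // namespace taken from the incoming symbol
    std::string symbol;  // mangled name of the same function at global scope
};

// Turns _ZN<len>ns<len>fnE<params> into { ns, _Z<len>fn<params> }.
// Returns nullopt for anything outside the narrow case described above.
std::optional<Stripped> stripNamespace(const std::string &mangled) {
    if (mangled.compare(0, 3, "_ZN") != 0) return std::nullopt;
    std::size_t pos = 3;
    auto ns = readSourceName(mangled, pos);
    if (!ns) return std::nullopt;
    auto fn = readSourceName(mangled, pos);
    if (!fn) return std::nullopt;
    if (pos >= mangled.size() || mangled[pos] != 'E') return std::nullopt;
    std::string params = mangled.substr(pos + 1);
    // Conservative: any 'S' might be a substitution whose index would change.
    if (params.empty() || params.find('S') != std::string::npos)
        return std::nullopt;
    return Stripped{*ns, "_Z" + std::to_string(fn->size()) + *fn + params};
}
```

For example, stripNamespace("_ZN5mod423fooEi") would yield the
namespace "mod42" and the symbol "_Z3fooi", which could then be looked
up only in the module associated with mod42. Anything involving
templates, nested namespaces or repeated compound types falls outside
this sketch and would need a real mangler.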

> Not sure if that helps.
>
> Greetings,
>
> Andres Freund

Thanks, Andres.