[LLVMdev] Runtime optimization of C++ code with virtual functions

Mon Jun 25 15:56:42 PDT 2007

On Thursday 21 June 2007 13:57, Stéphane Letz wrote:

> >> I understand that the disassembling portion needs to be rewritten. Is
> >> there anything else that would prevent this approach from working?
> >> Again, haven't looked into LLVM yet, so I can imagine there might be
> >> problems in describing physical registers in the IR and at some point
> >> stuff must be exactly where the pre-existing code expects it. I don't
> >> want to take your time, but if you could elaborate a bit it might
> >> prevent me from going down the wrong path.
> >
> > This should work, I don't expect you to run into any significant
> > problems.
> > When you're rewriting the LLVM IR for the indirect call, you can just
> > replace it with a direct call to the native code.
>
> Compared to template based specialization this would have the
> advantage of being dynamic.

But templates have the advantage of being able to be inlined.  This is a much
more important transformation than simply converting an indirect call to a
direct one, especially on modern implementations like Core or Opteron.
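
To make the contrast concrete, here is a minimal sketch of the two styles.
Only the names send() and transport() come from this thread; the Transport
interface, the Loopback and LoopbackPolicy types, the byte-count signature
and the send_virtual/send_static names are assumptions made up for
illustration.

```cpp
#include <cstddef>

// Hypothetical interface: the thread names only send() and transport().
struct Transport {
    virtual ~Transport() = default;
    virtual std::size_t transport(std::size_t bytes) = 0;
};

// A dynamically dispatched implementation (assumed for illustration).
struct Loopback : Transport {
    std::size_t transport(std::size_t bytes) override { return bytes; }
};

// A plain, non-virtual policy type for the template version (also assumed).
struct LoopbackPolicy {
    std::size_t transport(std::size_t bytes) { return bytes; }
};

// Dynamic version: the callee is selected through the vtable at run time,
// so the compiler generally cannot inline transport() into send_virtual().
inline std::size_t send_virtual(Transport& t, std::size_t bytes) {
    return t.transport(bytes);
}

// Template version: the callee is fixed by T at compile time, so the call
// is direct and the body of T::transport() is visible to the inliner.
template <class T>
std::size_t send_static(T& t, std::size_t bytes) {
    return t.transport(bytes);
}
```

Both versions compute the same result.  The difference is only in what the
optimizer can see at the call site.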

Your approach is going to make inlining very difficult, I think.  Not that 
there's a whole lot that can be done about it, given the binary translation
going on.  For example, how would you inline calls to send() where transport()
has been inlined (assuming send() wasn't already inlined)?

Is there some other set of transformations you have in mind to generate more
efficient code for transport() at run time?  Partial evaluation might be 
interesting, but that's applicable whether or not transport() is virtual.  In
fact, virtual call resolution is a form of partial evaluation where the 
run-time constants are the "this" pointer and its most-derived subclass type.

If you really want to generate fast code, it might be worth your while to 
implement more general partial evaluation and specialization.  If you make it
general enough, you'll get run-time virtual call resolution "for free."
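
As a rough sketch of what such a specialization could look like if written
by hand in C++: once the most-derived type is treated as a known constant,
the residual code is a type guard followed by a direct, inlinable call.
Only send() and transport() are names from this thread; the class
hierarchy, the signatures and the send_generic /
send_specialized_for_loopback names are assumptions for illustration, and
a real run-time specializer would emit the equivalent at the IR level
rather than in source.

```cpp
#include <cstddef>
#include <typeinfo>

// Hypothetical hierarchy, assumed for illustration.
struct Transport {
    virtual ~Transport() = default;
    virtual std::size_t transport(std::size_t bytes) = 0;
};

struct Loopback : Transport {
    std::size_t transport(std::size_t bytes) override { return bytes; }
};

struct Padded : Transport {
    std::size_t transport(std::size_t bytes) override { return bytes + 4; }
};

// General version: an indirect call through the vtable.
std::size_t send_generic(Transport& t, std::size_t bytes) {
    return t.transport(bytes);
}

// Residual version, specialized on "the dynamic type is Loopback".
std::size_t send_specialized_for_loopback(Transport& t, std::size_t bytes) {
    // typeid on a reference to a polymorphic type yields the dynamic type.
    if (typeid(t) == typeid(Loopback)) {
        // The qualified name suppresses virtual dispatch, so this is a
        // direct call that the compiler is free to inline.
        return static_cast<Loopback&>(t).Loopback::transport(bytes);
    }
    // Guard failed: fall back to ordinary dynamic dispatch.
    return t.transport(bytes);
}
```

The guard keeps the specialized version correct for every receiver; it is
only faster when the assumed type is the common case.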

You might also have a look at the Self papers.  The Self team did a lot of 
work on runtime optimization of dynamic dispatch.  IIRC they also did some
partial evaluation work.

                                           -Dave