[llvm-commits] Trampoline changes

Duncan Sands baldrick at free.fr
Sat Sep 8 15:03:58 PDT 2007

Currently LLVM has two intrinsics for supporting trampolines,
aka pointers to nested functions: init_trampoline and adjust_trampoline.
In essence, init_trampoline fills a block of memory with executable code,
and adjust_trampoline returns a suitable function pointer for calling that
code.  These patches fold adjust_trampoline into init_trampoline, leaving
init_trampoline as the only intrinsic (it now returns the function pointer
adjust_trampoline used to return).  This requires modifying gcc's nested
function lowering pass, which currently stores the executable code in the
"frame" used to pass the variables of a parent function to its children, and
calls adjust_trampoline whenever it needs the function pointer (the call may
occur in a child function, far from the point at which init_trampoline was
called).  After these patches, the function pointer is stored in the frame
instead, with the code being written to a temporary variable on the parent's
stack.  This means that the adjust_trampoline call happens immediately after the
init_trampoline call, so they might as well be combined into one intrinsic.  The
cost of organizing things this way is that space for one extra function pointer
per trampoline is allocated on the stack.
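
As a rough illustration of the layout change (this is not taken from the patches),
here is a portable C model.  Every name in it (vars, tramp_code, frame_old, frame_new,
model_init and friends) is made up for the sketch.  The "code block" only records the
function and frame pointers that real trampoline code would embed, so no executable
code is generated and the "function pointer" is just a handle to that block.

```c
#include <assert.h>
#include <stddef.h>

/* Variables of the parent that its nested functions may touch. */
struct vars { int counter; };

/* A lowered nested function: the frame arrives as an explicit chain pointer. */
typedef int (*nested_fn)(void *chain, int x);

/* Stand-in for the block of executable code.  A real trampoline holds
   target-specific machine code; this model only records the two values
   that code would embed. */
struct tramp_code { nested_fn fn; void *chain; };

/* Old scheme: the code block itself lives in the frame. */
struct frame_old { struct vars v; struct tramp_code code; };

/* New scheme: the frame holds only the resulting function pointer
   (modelled here as a handle to the code block). */
struct frame_new { struct vars v; const struct tramp_code *fp; };

/* Model of the old init_trampoline: fill the block, return nothing. */
static void model_init(struct tramp_code *code, nested_fn fn, void *chain) {
    code->fn = fn;
    code->chain = chain;
}

/* Model of adjust_trampoline: turn the block into something callable. */
static const struct tramp_code *model_adjust(const struct tramp_code *code) {
    return code;
}

/* Model of the combined intrinsic: fill the block and return the pointer. */
static const struct tramp_code *model_init_combined(struct tramp_code *code,
                                                    nested_fn fn, void *chain) {
    model_init(code, fn, chain);
    return model_adjust(code);
}

/* Model of calling through the trampoline's function pointer. */
static int call_fp(const struct tramp_code *fp, int x) {
    assert(fp != NULL && fp->fn != NULL);
    return fp->fn(fp->chain, x);
}

/* The nested function; vars is the first member of both frame layouts. */
static int child(void *chain, int x) {
    struct vars *v = chain;
    return v->counter + x;
}

/* Old scheme: init in the parent, adjust wherever the pointer is needed. */
int parent_old(int x) {
    struct frame_old f;
    f.v.counter = 10;
    model_init(&f.code, child, &f);
    return call_fp(model_adjust(&f.code), x);
}

/* New scheme: code goes to a temporary, the frame keeps the pointer. */
int parent_new(int x) {
    struct tramp_code tmp;
    struct frame_new f;
    f.v.counter = 10;
    f.fp = model_init_combined(&tmp, child, &f);
    return call_fp(f.fp, x);
}
```

Both parents compute the same result; the only difference is where the code block
lives and what the frame stores.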

My motive for doing this was not to simplify LLVM (though that is a nice benefit),
but to make it easier to do the optimization in the attached patch tramp_opt.diff.
This teaches instcombine how to turn a call to the function pointer returned by
init_trampoline into a direct call to the nested function.  This matters because
calling a function via a trampoline is amazingly expensive (presumably due to
icache flushing - remember that the code being executed lives on the stack).  Since a
pointer to a nested function is usually taken in order to pass it to some other function,
this optimization can only happen if that other function is inlined, so that the
init_trampoline call and the call through the nested function pointer occur in the same
function.  It is easy to construct testcases for which this works, but sadly I was unable
to get the optimization to occur even once in a trampoline-heavy real-world program
without jacking up the inlining limit hugely (to 100000; 10000 wasn't enough).  Still, I'm
hoping that it may sometimes be useful.
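
To make the shape of the transformation concrete, here is a small portable C model
(again, not the instcombine code from tramp_opt.diff).  The names frame, tramp, nested,
call_tramp and apply are invented for the sketch, and the tramp struct merely stands in
for a real trampoline by pairing the nested function with its frame pointer.

```c
#include <assert.h>
#include <stddef.h>

/* The parent's variables, passed to the nested function via a chain pointer. */
struct frame { int base; };

typedef int (*nested_fn)(struct frame *chain, int x);

/* Stand-in for a trampoline: the nested function plus its frame. */
struct tramp { nested_fn fn; struct frame *chain; };

/* The nested function after lowering. */
static int nested(struct frame *chain, int x) {
    return chain->base + x;
}

/* Model of an indirect call through the trampoline's function pointer. */
static int call_tramp(const struct tramp *t, int x) {
    assert(t != NULL && t->fn != NULL);
    return t->fn(t->chain, x);
}

/* Some other function that receives the pointer to the nested function. */
static int apply(const struct tramp *t, int x) {
    return call_tramp(t, x);
}

/* Typical case: the pointer escapes into apply, so unless apply is
   inlined the call stays indirect. */
int parent_passes_pointer(int base, int x) {
    struct frame f = { base };
    struct tramp t = { nested, &f };
    return apply(&t, x);
}

/* After inlining apply: the trampoline setup and the call through it
   now sit in the same function, which is the pattern to match. */
int parent_indirect(int base, int x) {
    struct frame f = { base };
    struct tramp t = { nested, &f };
    return call_tramp(&t, x);
}

/* After the optimization: a direct call with the chain passed explicitly;
   the trampoline is no longer needed for this call. */
int parent_direct(int base, int x) {
    struct frame f = { base };
    return nested(&f, x);
}
```

All three parents return the same value; only the last one avoids going through the
trampoline at all.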


-------------- next part --------------
A non-text attachment was scrubbed...
Name: tramp_intr.diff
Type: text/x-diff
Size: 17810 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20070909/405599b7/attachment.diff>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tramp_4_0.diff
Type: text/x-diff
Size: 9843 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20070909/405599b7/attachment-0001.diff>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tramp_4_2.diff
Type: text/x-diff
Size: 9849 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20070909/405599b7/attachment-0002.diff>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tramp_opt.diff
Type: text/x-diff
Size: 6359 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20070909/405599b7/attachment-0003.diff>
