[llvm-commits] Trampoline changes

Fri Sep 28 14:25:30 PDT 2007

>> It is very feasible and would be pretty straight-forward.  Consider
>> this code in C:
>>
>> static void foo(funcptr_t P) {
>>    P();
>> }
>>
>> void a() {
>>    tmp = maketrampoline(somefunc1, somedata);
>>    foo(tmp);
>> }
>> void b() {
>>    tmp = maketrampoline(somefunc2, somedata);
>>    foo(tmp);
>> }
>> void c() {
>>    tmp = maketrampoline(somefunc3, somedata);
>>    foo(tmp);
>> }
>
> This would probably suffice for a decent percentage of cases.  However,
> things get more complicated once you have multiple trampolines (because
> of multiple nested functions) and multiply nested functions (functions
> nested inside other nested functions).

If you're interested in this, you should start simple and generalize
out from there.  Handling multiply nested functions and other harder
cases can build on the basic case once it's implemented and you have
some experience with it.

> The multiple trampoline problem is kind of dumb: the first thing that is
> done to the result of the init_trampoline call is that it is stored in a
> local variable.  Since init_trampoline is IntrWriteMem (because it
> stashes somefunc and somedata in the trampoline) if there are two
> init_trampoline calls in a row then, when the result of the first one is
> read from memory in order to be used, LLVM won't recognise it as the
> result of an init_trampoline call anymore, because it thinks it might
> have been clobbered by the other call (IntrWriteMem causes LLVM to be
> very pessimistic).  The other problem (nested functions nested within
> other nested functions) is that gcc passes the trampoline pointer to
> child functions in a struct (the frame struct).  So if you want to
> handle this case then you have to track where this struct is being
> passed to and do more complicated analysis.

Sure, but you want to optimize trampolines better in any case :)

>> 3. It takes some function pointer as an argument.
>
> Yes, it becomes much harder if the function pointer is passed in in
> another way, for example in a struct.

Yep, but fortunately other optimizations already hack on these in  
some cases.   You're not going to be able to do the fully general  
case efficiently.  It's probably not worth worrying about until the  
basic case works.

>> 4. all call sites (which you know are direct) pass in a function
>> pointer obtained from llvm.trampoline
>
> And the functions the trampolines were made from have their "nest"
> parameter at the same position (this is the case for code coming
> from gcc, where it is always the first parameter).

I don't know what this means, but I believe you :)

> Also
> 5. Within foo, you need to check that nothing too complicated is
> done to P1, for example it isn't stored in a global or passed to
> some other function.

Actually, worst case, you could just make a trampoline in foo.   
However, you're right that this is only worth doing if there are  
calls to the fn pointer in foo.  OTOH, propagating the trampoline  
creation site down to foo will allow it to be further propagated into  
functions that foo might call.
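
A rough sketch of that propagation, in plain C rather than LLVM IR.
Everything here is made up for illustration (struct frame, add_one, bar
and run are hypothetical names, not from the thread): the nested
function is modelled as an ordinary function taking its "nest" value as
the first parameter, and foo receives the function and the nest value
as a pair instead of a single trampoline pointer.

```c
/* Hypothetical frame of the enclosing function (illustration only). */
struct frame { int total; };

/* Stand-in for a lowered nested function: the nest parameter comes
   first, as it does in code coming from gcc. */
static void add_one(struct frame *nest) { nest->total += 1; }

/* A function that foo calls.  Because foo now holds the (fn, nest)
   pair, the pair can be passed on and bar can call fn directly too. */
static void bar(void (*fn)(struct frame *), struct frame *nest) {
    fn(nest);
    fn(nest);
}

/* foo after the transformation: two parameters instead of one
   trampoline pointer. */
static void foo(void (*fn)(struct frame *), struct frame *nest) {
    fn(nest);       /* direct call inside foo */
    bar(fn, nest);  /* creation site propagated one level further */
}

/* Caller that would otherwise have built a trampoline for add_one. */
int run(void) {
    struct frame f = {0};
    foo(add_one, &f);
    return f.total;
}
```

If foo had to hand the pointer to code that cannot be rewritten this
way, it would fall back to making a trampoline at that point.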

>> In this case, you can handle it, transforming it as above.  The nice
>> thing about this is that the xform doesn't increase code size, so
>> it's always a win, regardless of how large foo is.
>
> Well, you do have to push another parameter when calling foo, and
> when foo calls P1 :)

Yes, but you'll usually get to nuke the trampoline, saving space :)
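
To make the trade-off concrete, here is a minimal C sketch of how the
basic case might look after the transformation.  It is only an
illustration under assumptions: struct frame, the bodies of somefunc1
and somefunc2, and the run_a/run_b wrappers are invented, and real
nested functions would be lowered by the front end rather than written
by hand.  foo gains one extra parameter (the nest value), and the
callers no longer build a trampoline at all.

```c
/* Hypothetical frame of the enclosing function (illustration only). */
struct frame { int counter; };

/* Stand-ins for lowered nested functions; the nest parameter is the
   first parameter, as in code coming from gcc. */
static void somefunc1(struct frame *nest) { nest->counter += 1; }
static void somefunc2(struct frame *nest) { nest->counter += 10; }

/* foo after the transformation: the function and its nest value are
   passed separately, and the call through P becomes a call that
   pushes the nest value as an extra argument. */
static void foo(void (*fn)(struct frame *), struct frame *nest) {
    fn(nest);
}

/* Counterparts of a() and b(): no trampoline is created here. */
int run_a(void) {
    struct frame f = {0};
    foo(somefunc1, &f);
    return f.counter;
}

int run_b(void) {
    struct frame f = {0};
    foo(somefunc2, &f);
    return f.counter;
}
```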

>> If you wanted to do this, this should go into the arg promotion pass,
>> which already does much of this analysis.
>
> It's a very nice idea - I'm not sure how many cases it would capture
> though.

Me neither!  I'd suggest building some Ada code, running llvm-ld on
it to internalize it, then looking for calls to trampoline creation
and seeing how they are used.

-Chris