[PATCH] Teach TailRecursionElimination (TRE) to handle certain cases of nocapture escaping allocas.
Michael Gottesman
mgottesman at apple.com
Fri Jul 5 18:59:32 PDT 2013
Thanks Nick!
Michael
On Jul 5, 2013, at 6:58 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
> Michael Gottesman wrote:
>> Hello llvm-commits!
>>
>> This patch teaches TRE to understand and handle nocapture allocas so that
>> the TRE pass can place the tail marker on more call sites.
>
> Also relevant, this is the second time this patch has been picked up and continued. Here's a link to the first time:
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130204/165122.html
> continued http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130211/165176.html
>
> Nick
>
>> Without the changes introduced by this patch, if TRE saw any allocas at all,
>> it would neither perform TRE *nor* mark callsites with the tail marker.
>>
>> Because TRE runs after mem2reg, this inadequacy is not a death sentence. But
>> given a callsite A without escaping alloca arguments, A may not be able to have
>> the tail marker placed on it due to a separate callsite B having a write-back
>> parameter passed in via an argument with the nocapture attribute.
>>
>> Assume that B is the only other callsite besides A and B only has nocapture
>> escaping alloca arguments (*NOTE* B may have other arguments that are not passed
>> allocas). In this case not marking A with the tail marker is unnecessarily
>> conservative since:
>>
>> 1. By assumption A has no escaping alloca arguments itself so it cannot
>> access the caller's stack via its arguments.
>>
>> 2. Since all of B's escaping alloca arguments are passed as parameters with
>> the nocapture attribute, we know that B does not stash said escaping
>> allocas in a manner that outlives B itself and thus could be accessed
>> indirectly by A.
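
For illustration only (this is not taken from the patch), the A/B situation described above might arise from source like the following. The function names are hypothetical, and the sketch assumes computeB is not inlined, so that tmp survives mem2reg as an alloca whose address is passed to a callee that never retains the pointer.

```cpp
#include <cassert>

// Hypothetical callee with a write-back parameter. It writes through `out`
// but never stores the pointer anywhere, which is the behavior the
// nocapture attribute describes at the IR level.
static void computeB(int input, int *out) {
  assert(out != nullptr);
  *out = input * 2;
}

// Hypothetical callee that receives no pointer into the caller's frame.
static int consumeA(int value) { return value + 1; }

int caller(int x) {
  int tmp;               // stays an alloca: its address is passed to computeB
  computeB(x, &tmp);     // callsite B: alloca passed as a nocapture argument
  return consumeA(tmp);  // callsite A: no alloca argument, tail marker candidate
}
```

Here callsite A only receives the loaded value of tmp, and computeB cannot have stashed the address for later use, which is the reasoning in points 1 and 2.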
>>
>> With the changes introduced by this patch:
>>
>> 1. If we see any escaping allocas passed as captured arguments, we do
>> nothing and bail early.
>>
>> 2. If we do not see any escaping allocas passed as captured arguments but we
>> do see escaping allocas passed as nocapture arguments:
>>
>> i. We do not perform TRE to avoid PR962 since the code generator produces
>> significantly worse code for the dynamic allocas that would be created
>> by the TRE algorithm.
>>
>> ii. If the function does not call a function that returns twice (e.g.
>> setjmp), mark call sites without escaping allocas with the tail
>> marker. *NOTE* This excludes call sites that are passed escaping
>> nocapture allocas.
>>
>> 3. If we do not see any escaping allocas at all (whether captured or not):
>>
>> i. If the function does not use setjmp, mark all callsites with the tail
>> marker.
>>
>> ii. If there are no dynamic/variable sized allocas in the function,
>> attempt to perform TRE on all callsites in the function.
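
The three cases above can be read as a small decision function. The following is a minimal sketch of that reading, not the patch's code: FunctionScan, Decision, and decide are hypothetical names, and the scan results are assumed to be computed elsewhere.

```cpp
#include <cassert>

// Hypothetical summary of what a scan of the function found. These names are
// illustrative only and do not come from the patch or from LLVM.
struct FunctionScan {
  bool capturedEscapingAlloca;   // an alloca is passed as a captured argument
  bool nocaptureEscapingAlloca;  // an alloca is passed as a nocapture argument
  bool callsReturnsTwice;        // e.g. the function uses setjmp
  bool hasDynamicAlloca;         // a dynamic/variable sized alloca is present
};

struct Decision {
  bool markTail;    // tail-mark call sites that are not passed escaping allocas
  bool attemptTRE;  // try tail recursion elimination on the function
};

Decision decide(const FunctionScan &s) {
  Decision d{false, false};
  if (s.capturedEscapingAlloca)
    return d;                          // case 1: do nothing, bail early
  d.markTail = !s.callsReturnsTwice;   // cases 2.ii and 3.i
  if (!s.nocaptureEscapingAlloca)
    d.attemptTRE = !s.hasDynamicAlloca;  // case 3.ii
  // Case 2.i: TRE is never attempted while a nocapture alloca escapes (PR962).
  assert(!(d.attemptTRE && s.nocaptureEscapingAlloca));
  return d;
}
```

In case 3 there are no escaping allocas at all, so "call sites that are not passed escaping allocas" covers every call site in the function.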
>>
>> Please Review,
>> Michael