[PATCH] Teach TailRecursionElimination (TRE) to handle certain cases of nocapture escaping allocas.

Michael Gottesman mgottesman at apple.com
Wed Jul 10 18:49:34 PDT 2013


After a bunch of spelunking, it seems that the miscompile is not in the previous patch itself but is rather a bug in LLVM in how dynamic allocas are handled. 

My reasoning for concluding as such is that I was able to eliminate the previous version of the bug by using:

1. fastisel on all basic blocks.
2. selectiondag on all basic blocks except for one very fishy basic block.

Nick and I chatted on IRC and decided that given that my patch does not expose said issue that I should file a PR with all of the information from my spelunking and proceed with committing this patch. I will post the PR before I commit.

Michael

On Jul 8, 2013, at 4:28 PM, Michael Gottesman <mgottesman at apple.com> wrote:

>> 
>> I'd like to know what correctness bug was fixed in this patch relative to the patch it's based on. The alternative is to declare that the patch was correct before and merely triggering some other bug in llvm which has since gone away.
> 
> The latter is the wrong approach. I am going to spend some time this afternoon/evening spelunking a bit more through the miscompiled code. I do not want to commit this until I am able to get to the bottom of the issue.
> 
>> 
>> I did some testing with this patch applied and didn't encounter any miscompiles. Michael, please test the patch and ensure it doesn't still miscompile some part of the static analyzer (that's what got it reverted last time?).
> 
> I did before I sent it to the list. Specifically the miscompile occurred in the bootstrapped clang when it was running the static analyzer tests.
> 
> Michael
> 
>> 
>> The patch looks fine to me.
>> 
>> Nick
>> 
>> 
>> 
>> On 5 July 2013 15:33, Michael Gottesman <mgottesman at apple.com> wrote:
>> Hello llvm-commits!
>> 
>> This patch teaches TRE how to understand/handle no capture allocas in order
>> to allow for more call sites to have the tail marker placed on them by the TRE pass.
>> 
>> Without the changes introduced into this patch, if TRE saw any allocas at all,
>> TRE would not perform TRE *or* mark callsites with the tail marker.
>> 
>> Because TRE runs after mem2reg, this inadequacy is not a death sentence. But
>> given a callsite A without escaping alloca argument, A may not be able to have
>> the tail marker placed on it due to a separate callsite B having a write-back
>> parameter passed in via an argument with the nocapture attribute.
>> 
>> Assume that B is the only other callsite besides A and B only has nocapture
>> escaping alloca arguments (*NOTE* B may have other arguments that are not passed
>> allocas). In this case not marking A with the tail marker is unnecessarily
>> conservative since:
>> 
>>   1. By assumption A has no escaping alloca arguments itself so it can not
>>      access the caller's stack via its arguments.
>> 
>>   2. Since all of B's escaping alloca arguments are passed as parameters with
>>      the nocapture attribute, we know that B does not stash said escaping
>>      allocas in a manner that outlives B itself and thus could be accessed
>>      indirectly by A.
>> 
>> With the changes introduced by this patch:
>> 
>>   1. If we see any escaping allocas passed as a capturing argument, we do
>>      nothing and bail early.
>> 
>>   2. If we do not see any escaping allocas passed as captured arguments but we
>>      do see escaping allocas passed as nocapture arguments:
>> 
>>        i. We do not perform TRE to avoid PR962 since the code generator produces
>>           significantly worse code for the dynamic allocas that would be created
>>           by the TRE algorithm.
>> 
>>        ii. If we do not return twice, mark call sites without escaping allocas
>>            with the tail marker. *NOTE* This excludes functions with escaping
>>            nocapture allocas.
>> 
>>   3. If we do not see any escaping allocas at all (whether captured or not):
>> 
>>        i. If we do not have usage of setjmp, mark all callsites with the tail
>>           marker.
>> 
>>        ii. If there are no dynamic/variable sized allocas in the function,
>>            attempt to perform TRE on all callsites in the function.
>> 
>> Please Review,
>> Michael
>> 
>> 
>> 
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130710/52960985/attachment.html>


More information about the llvm-commits mailing list