[PATCH] Teach TailRecursionElimination (TRE) to handle certain cases of nocapture escaping allocas.
Michael Gottesman
mgottesman at apple.com
Fri Jul 5 15:33:41 PDT 2013
Hello llvm-commits!
This patch teaches TRE how to understand/handle no capture allocas in order
to allow for more call sites to have the tail marker placed on them by the TRE pass.
Without the changes introduced into this patch, if TRE saw any allocas at all,
TRE would not perform TRE *or* mark callsites with the tail marker.
Because TRE runs after mem2reg, this inadequacy is not a death sentence. But
given a callsite A without escaping alloca argument, A may not be able to have
the tail marker placed on it due to a separate callsite B having a write-back
parameter passed in via an argument with the nocapture attribute.
Assume that B is the only other callsite besides A and B only has nocapture
escaping alloca arguments (*NOTE* B may have other arguments that are not passed
allocas). In this case not marking A with the tail marker is unnecessarily
conservative since:
1. By assumption A has no escaping alloca arguments itself so it can not
access the caller's stack via its arguments.
2. Since all of B's escaping alloca arguments are passed as parameters with
the nocapture attribute, we know that B does not stash said escaping
allocas in a manner that outlives B itself and thus could be accessed
indirectly by A.
With the changes introduced by this patch:
1. If we see any escaping allocas passed as a capturing argument, we do
nothing and bail early.
2. If we do not see any escaping allocas passed as captured arguments but we
do see escaping allocas passed as nocapture arguments:
i. We do not perform TRE to avoid PR962 since the code generator produces
significantly worse code for the dynamic allocas that would be created
by the TRE algorithm.
ii. If we do not return twice, mark call sites without escaping allocas
with the tail marker. *NOTE* This excludes functions with escaping
nocapture allocas.
3. If we do not see any escaping allocas at all (whether captured or not):
i. If we do not have usage of setjmp, mark all callsites with the tail
marker.
ii. If there are no dynamic/variable sized allocas in the function,
attempt to perform TRE on all callsites in the function.
Please Review,
Michael
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Teach-TailRecursionElimination-to-handle-certain-cas.patch
Type: application/octet-stream
Size: 15017 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130705/1113d1a7/attachment.obj>
-------------- next part --------------
More information about the llvm-commits
mailing list