patch: mark all possible tail calls as "tail"

Hal Finkel hfinkel at anl.gov
Wed May 7 09:31:51 PDT 2014


----- Original Message -----
> From: "Hal Finkel" <hfinkel at anl.gov>
> To: "Nick Lewycky" <nlewycky at google.com>
> Cc: "Commit Messages and Patches for LLVM" <llvm-commits at cs.uiuc.edu>
> Sent: Wednesday, May 7, 2014 9:12:10 AM
> Subject: Re: patch: mark all possible tail calls as "tail"
> 
> ----- Original Message -----
> > From: "Nick Lewycky" <nlewycky at google.com>
> > To: "Commit Messages and Patches for LLVM"
> > <llvm-commits at cs.uiuc.edu>
> > Sent: Saturday, May 3, 2014 1:17:43 AM
> > Subject: patch: mark all possible tail calls as "tail"
> > 
> > 
> > 
> > I observed that LLVM fails to mark "tail" on some simple cases. In
> > particular it seems that there is only one pass which does it,
> > TailRecursionElimination, and TRE will skip the entire function if
> > any call argument is derived from an alloca or byval argument.
> > 
> > 
> > 
> > I've implemented a patch which does the full expensive analysis:
> > look at every instruction, make note of allocas and byval arguments
> > and
> 
> Just to be clear, the desire here is to prevent alloca-derived
> addresses from making it into tail-called functions, right? I took a
> quick look at the patch, but I don't see what happens if an
> alloca-derived address is stored into a global variable. In that
> case, the callee could obviously still reach it. Can you explain
> what happens in this case?

Never mind, it looks like the actual commit (r208017) addressed this.
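
For concreteness, the case asked about above can be sketched in C++
roughly as follows. This is only an illustration: g_slot, callee and
caller are hypothetical names, not anything taken from the patch or
from r208017.

```cpp
#include <cassert>

// Hypothetical global used to leak a stack address out of a frame.
int *g_slot = nullptr;

// Takes no pointer arguments, yet can still reach the caller's local
// through the global.
int callee() { return g_slot ? *g_slot : -1; }

int caller() {
  int local = 42;
  g_slot = &local;  // frame address escapes via memory, not via an argument
  // If this call were emitted as a real tail call (a jump), caller's frame
  // would be gone before callee runs and g_slot would dangle. So the call
  // must not be marked "tail" even though none of its arguments are
  // alloca-derived.
  return callee();
}
```

A check that looks only at call arguments would miss this, which is why
stores of alloca-derived addresses have to be tracked as well.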

 -Hal

> 
> Thanks again,
> Hal
> 
> > all values which are potentially derived from those, and then mark
> > calls which never get those as input as "tail". Calls which get
> > alloca-derived values and could write them into memory "poison" all
> > non-readnone functions which are reachable after they run. This is
> > surely O(n^2) with an expensive "isPotentiallyReachable" call at
> > every step, but I didn't notice any slowdown, though I measured
> > without any instrumentation.
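
The analysis described in the quoted text can be modelled on a toy,
straight-line instruction list. The following is a minimal sketch under
stated assumptions, not the code from the patch: Op, Inst and
markTailCalls are made-up names, "reachable after" is approximated by
program order (so no isPotentiallyReachable query is needed), and a
store of an alloca-derived value is treated as an escape, in line with
the global-variable concern raised in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy instruction kinds; none of these types exist in LLVM.
enum class Op { Alloca, Derive, Store, Call };

struct Inst {
  Op op;
  int result;                 // id of the value defined, or -1 if none
  std::vector<int> operands;  // ids used; for Store, operands[0] is the stored value
  bool readnone;              // only meaningful for Call
};

// For each instruction, reports whether it is a call that may be marked "tail".
std::vector<bool> markTailCalls(const std::vector<Inst> &body) {
  std::set<int> derived;  // values that may point into the current frame
  bool escaped = false;   // has a frame address possibly reached memory?
  std::vector<bool> tail(body.size(), false);

  auto isDerived = [&](int v) { return derived.count(v) != 0; };

  for (std::size_t i = 0; i < body.size(); ++i) {
    const Inst &inst = body[i];
    bool anyDerived = false;
    for (int v : inst.operands)
      anyDerived = anyDerived || isDerived(v);

    switch (inst.op) {
    case Op::Alloca:
      derived.insert(inst.result);
      break;
    case Op::Derive:  // stands in for casts, GEPs and similar
      if (anyDerived) derived.insert(inst.result);
      break;
    case Op::Store:
      assert(!inst.operands.empty());
      if (isDerived(inst.operands[0])) escaped = true;
      break;
    case Op::Call:
      if (anyDerived) {
        // The callee sees a frame address: never "tail", and unless it is
        // readnone it may stash the address, poisoning later calls.
        if (!inst.readnone) escaped = true;
        if (inst.result >= 0) derived.insert(inst.result);
      } else if (inst.readnone || !escaped) {
        tail[i] = true;
      }
      break;
    }
  }
  return tail;
}
```

In this model a call that receives an alloca-derived value is never
marked, every later non-readnone call is poisoned, and readnone calls
with clean arguments stay markable throughout.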
> > 
> > 
> > 
> > 
> > Roughly 80,000 additional calls are marked tail in a bootstrap of
> > clang. Sadly this doesn't correlate to actual "jmp" instructions,
> > due to what appear to be further optimizer deficiencies.
> > 
> > 
> > I have attached my patch for review. Are there ways this could be
> > done more efficiently? Are there places we're redoing work that
> > could be shared? What is a sensible set of limits on it that will
> > prevent runaway optimizer time? Should it be part of TRE or moved
> > to a separate pass? Land it as-is and we'll find out what breaks
> > later?
> > 
> > 
> > Nick
> > 
> > 
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


