[PATCH] A new HeapToStack allocation promotion pass

Sat Oct 5 20:47:07 PDT 2013

On Sat, Oct 5, 2013 at 6:53 PM, Hal Finkel <hfinkel at anl.gov> wrote:

>
> ----- Nick Lewycky <nicholas at mxc.ca> wrote:
> > Hal Finkel wrote:
> > > ----- Original Message -----
> > >> Hal Finkel wrote:
> > >>> ----- Original Message -----
> > >>>> (I am not on the list)
> > >>>>
> > >>>>> This adds a new optimization pass that can 'promote' malloc'd
> > >>>>> memory to
> > >>>>> stack-allocated memory when the lifetime of the allocation can be
> > >>>>> determined to be bounded by the execution of the function.
> > >>>>>
> > >>>>> To be specific, consider the following three cases:
> > >>>>>
> > >>>>> void bar(int *x);
> > >>>>>
> > >>>>> void foo1() {
> > >>>>> int *x = malloc(16);
> > >>>>> bar(x);
> > >>>>> free(x);
> > >>>>> }
> > >>>>>
> > >>>>> In this case the malloc can be replaced by an alloca, and the
> > >>>>> free
> > >>>>> removed. Note that this is true even though the pointer 'x' is
> > >>>>> definitely
> > >>>>> captured (and may be recorded in global storage, etc.).
> > >>>>
> > >>>> Hello,
> > >>>>
> > >>>> this seems to rely on the fact that 'bar' returns normally, and
> > >>>> thus
> > >>>> that
> > >>>> whenever malloc is executed, free will be as well. However, bar
> > >>>> could
> > >>>> never return, or return abnormally by throwing an exception which
> > >>>> will
> > >>>> skip the call to free.
> > >>>>
> > >>>> void bar(int *x) {
> > >>>> free(x);
> > >>>> throw 42;
> > >>>> }
> > >>>>
> > >>>> will result in calling free on the stack. Now if there was a
> > >>>> destructor
> > >>>> calling free in foo1... Do you actually also consider exceptions
> > >>>> when
> > >>>> you
> > >>>> test that all paths from malloc to the exits contain a call to
> > >>>> free?
> > >>>> That
> > >>>> would only leave noreturn functions.
> > >>>
> > >>> Marc,
> > >>>
> > >>> Thanks!
> > >>>
> > >>> My understanding is that the extra control-flow from exception
> > >>> handling should be accounted for by the basic-block
> > >>> successor/predecessor information (because calling a function that
> > >>> might throw uses invoke, and then you see both the regular
> > >>> predecessor and the cleanup block as predecessors).
> > >>
> > >> No, that's only true if the caller handles the extra control-flow. If
> > >> you call (not invoke) a function, and the callee throws, then the
> > >> exception propagates out of the caller, going up the stack until it
> > >> hits
> > >> the function that did use invoke.
> > >>
> > >> Better is to check "nounwind". However, that is also not sufficient,
> > >> because in llvm nounwind functions may call longjmp.
> > >>
> > >> I'll double-check that's correct. Also, you're right, I should
> > >> check
> > >> the function's does-not-return attribute also.
> > >>
> > >> A function which is marked 'noreturn' is guaranteed to never return.
> > >> A
> > >> function not marked 'noreturn' may also not return -- it may
> > >> terminate
> > >> the program, longjmp, throw an exception, or loop infinitely.
> > >
> > > It looks like I could use your 'halting' attribute. What's the status
> > > on that?
> >
> > I'm not working on it.
> >
> > There's an infrastructure problem in LLVM that makes this hard. You want
> > to use the function analyses and the call graph analysis for SCCs.
> > Logically this fits into a CGSCC pass but you can't put it there because
> > those can't depend on function passes. Your options are to either make
> > the whole thing a module pass (bad, doesn't get interwoven as part of
> > the inliner run) or to have the CGSCC pass depend on a really hokey
> > module pass which uses the function pass.
> >
> > This is the same problem Chandler is fixing with his PassManager
> > rewrite, with the goal of letting the inliner use the function passes.
> >
> > Also, the "is there a loop local to this function" detection in my patch
> > is wrong, it was detecting presence of natural loops, not presence of
> > backedges.
>
> Do you know how your loop detection could be fixed?

Detect irreducible loops :)

> Would it be sufficient to check that all backedges in the function had an
> associated Loop, and if any did not, return a conservative answer?
>

This would work. If you want something a little better, you can also use
SCC-based approaches to detecting irreducible loops (see GCC's
mark_irreducible_loops), or DJ-graph-based approaches (augmented dominator
trees). They should produce exactly equivalent results, last I looked.

They should also properly find all loops, both reducible and irreducible.
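
A minimal sketch of the conservative check discussed above, written as
standalone C++ rather than against LLVM's API. It assumes a toy CFG stored as
successor lists with block 0 as the entry; the names CFG, LoopKind,
computeDominators and classifyCycles are hypothetical and exist only for this
illustration. A real pass would presumably rely on LLVM's DominatorTree and
LoopInfo instead. The idea: walk the CFG depth-first, and for every edge that
targets a block still on the DFS stack, require that the target dominates the
source. If it does not, the cycle has no natural-loop header and the answer
must be conservative.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CFG: succs[b] lists the successors of block b; block 0 is the entry.
using CFG = std::vector<std::vector<int>>;

enum class LoopKind { NoCycles, NaturalLoopsOnly, Irreducible };

// Iterative dominator sets: dom[b][d] is true when block d dominates block b.
// Simple and slow on purpose; only meant for small illustrative graphs.
static std::vector<std::vector<bool>> computeDominators(const CFG &succs) {
  assert(!succs.empty() && "the CFG needs an entry block");
  const std::size_t n = succs.size();
  std::vector<std::vector<int>> preds(n);
  for (std::size_t b = 0; b < n; ++b)
    for (int s : succs[b])
      preds[s].push_back(static_cast<int>(b));

  // Start from "everything dominates everything" and shrink to a fixed point.
  std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
  dom[0].assign(n, false);
  dom[0][0] = true;
  for (bool changed = true; changed;) {
    changed = false;
    for (std::size_t b = 1; b < n; ++b) {
      std::vector<bool> next(n, true);
      for (int p : preds[b])
        for (std::size_t d = 0; d < n; ++d)
          next[d] = next[d] && dom[p][d];
      next[b] = true;  // every block dominates itself
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

// Classify the cycles reachable from the entry block.
static LoopKind classifyCycles(const CFG &succs) {
  if (succs.empty())
    return LoopKind::NoCycles;
  const auto dom = computeDominators(succs);
  // 0 = unvisited, 1 = on the DFS stack, 2 = finished
  std::vector<int> state(succs.size(), 0);
  // Each stack entry is (block, index of the next successor to visit).
  std::vector<std::pair<int, std::size_t>> stack;
  stack.push_back({0, 0});
  state[0] = 1;
  LoopKind result = LoopKind::NoCycles;
  while (!stack.empty()) {
    const int block = stack.back().first;
    if (stack.back().second == succs[block].size()) {
      state[block] = 2;
      stack.pop_back();
      continue;
    }
    const int target = succs[block][stack.back().second++];
    if (state[target] == 0) {
      state[target] = 1;
      stack.push_back({target, 0});
    } else if (state[target] == 1) {
      // Edge to a block still on the DFS stack: it closes a cycle.
      // It belongs to a natural loop only if the target dominates the source.
      if (!dom[block][target])
        return LoopKind::Irreducible;
      result = LoopKind::NaturalLoopsOnly;
    }
  }
  return result;
}
```

For the halting analysis, only NoCycles would let the "no local loop" reasoning
go through; Irreducible is the case that a natural-loop-only detector misses.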

Neither is particularly difficult to implement. See
http://moss.csc.ncsu.edu/~mueller/ftp/pub/mueller/papers/toplas2184.pdf
(part 5) for the DJ-graph version (which you could easily modify to not
require an explicit DJ-graph structure; LLVM has all the info anyway).

GCC's is a little less documented, because Zdenek, who implemented it, is a
graph expert, but if you read the paper first, you can see both approaches
really do the same thing.

I have no idea how often irreducible loops still occur in real code
in a way that really matters for your analysis, so I can't opine on whether
this is worth the time vs. your suggested solution :)

--Dan