[PATCH] A new HeapToStack allocation promotion pass

Wed Oct 9 03:03:18 PDT 2013

Daniel Berlin wrote:
>
>
>
> On Sun, Oct 6, 2013 at 11:20 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>
>     Hal Finkel wrote:
>
>
>         ----- Nick Lewycky <nicholas at mxc.ca> wrote:
>
>             Hal Finkel wrote:
>
>                 ----- Original Message -----
>
>                     Hal Finkel wrote:
>
>                         ----- Original Message -----
>
>                             (I am not on the list)
>
>                                 This adds a new optimization pass that
>                                 can 'promote' malloc'd
>                                 memory to
>                                 stack-allocated memory when the lifetime
>                                 of the allocation can be
>                                 determined to be bounded by the
>                                 execution of the function.
>
>                                 To be specific, consider the following
>                                 three cases:
>
>                                 void bar(int *x);
>
>                                 void foo1() {
>                                   int *x = malloc(16);
>                                   bar(x);
>                                   free(x);
>                                 }
>
>                                 In this case the malloc can be replaced
>                                 by an alloca, and the free removed.
>                                 Note that this is true even though the
>                                 pointer 'x' is definitely captured (and
>                                 may be recorded in global storage, etc.).
>
>
>                             Hello,
>
>                             this seems to rely on the fact that 'bar'
>                             returns normally, and thus that whenever
>                             malloc is executed, free will be as well.
>                             However, bar could never return, or return
>                             abnormally by throwing an exception which
>                             will skip the call to free.
>
>                             void bar(int *x) {
>                               free(x);
>                               throw 42;
>                             }
>
>                             will result in calling free on the stack.
>                             Now if there was a destructor calling free
>                             in foo1... Do you actually also consider
>                             exceptions when you test that all paths
>                             from malloc to the exits contain a call to
>                             free? That would only leave noreturn
>                             functions.
>
>
>                         Marc,
>
>                         Thanks!
>
>                         My understanding is that the extra control-flow
>                         from exception handling should be accounted for
>                         by the basic-block successor/predecessor
>                         information (because calling a function that
>                         might throw uses invoke, and then you see both
>                         the regular predecessor and the cleanup block as
>                         predecessors).
>
>
>                     No, that's only true if the caller handles the extra
>                     control-flow. If you call (not invoke) a function,
>                     and the callee throws, then the exception propagates
>                     out of the caller, going up the stack until it hits
>                     the function that did use invoke.
>
>                     Better is to check "nounwind". However, that is also
>                     not sufficient, because in llvm nounwind functions
>                     may call longjmp.
>
>                     I'll double-check that's correct. Also, you're
>                     right, I should check the function's does-not-return
>                     attribute also.
>
>                     A function which is marked 'noreturn' is guaranteed
>                     to never return. A function not marked 'noreturn'
>                     may also not return -- it may terminate the program,
>                     longjmp, throw an exception, or loop infinitely.
>
>
>                 It looks like I could use your 'halting' attribute.
>                 What's the status on that?
>
>
>             I'm not working on it.
>
>             There's an infrastructure problem in LLVM that makes this
>             hard. You want
>             to use the function analyses and the call graph analysis for
>             SCCs.
>             Logically this fits into a CGSCC pass but you can't put it
>             there because
>             those can't depend on function passes. Your options are to
>             either make
>             the whole thing a module pass (bad, doesn't get interwoven
>             as part of
>             the inliner run) or to have the CGSCC pass depend on a
>             really hokey
>             module pass which uses the function pass.
>
>             This is the same problem Chandler is fixing with his PassManager
>             rewrite, with the goal of letting the inliner use the
>             function passes.
>
>             Also, the "is there a loop local to this function" detection
>             in my patch
>             is wrong, it was detecting presence of natural loops, not
>             presence of
>             backedges.
>
>
>         Do you know how your loop detection could be fixed? Would it be
>         sufficient to check that all backedges in the function had an
>         associated Loop, and if any did not, return a conservative answer?
>
>
>     Find backedges (J-edges) by looking at the CFG and checking which
>     successors aren't strictly dominated according to the domtree. For
>     each backedge, look up its Loop. If it doesn't have one, bail. (Note
>     that the blocks may both be members of a loop, but that may be an
>     outer loop incidentally containing both. You need to check that it's
>     actually a backedge of this loop, which is true when the branch
>     destination is the loop header block.) Query SCEV for the loop trip
>     count, if it doesn't know, bail.
>
>     (Dan Berlin's answer on the thread is also correct, but forgot to
>     address the issue of whether the found loops are finite. For any
>     irreducible loop, there's no facility in LLVM that determines
>     whether it will terminate.)
>
>
> Right, so as Hal pointed out to me privately, is it even worth doing
> anything other than bailing in the case of non-natural loops?

In the event of unnatural loops, drop the function and run like hell.
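
For what it's worth, here is a minimal sketch of the check quoted above,
written as standalone C++ rather than against the LLVM API. Everything in
it is an assumption made for illustration: the CFG type, the function
names, and the finiteHeaders set (a stand-in for "SCEV knows the trip
count of the loop with this header") are all hypothetical. It also swaps
in a plain DFS to find retreating edges and a simple iterative dominator
computation instead of the real domtree and LoopInfo. A retreating edge
whose target does not dominate its source means the cycle is not a
natural loop, so the sketch bails.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function's CFG: block 0 is the entry and
// succs[b] lists the successors of block b.
using CFG = std::vector<std::vector<int>>;

// Iterative dominator sets: dom[b][d] is true when block d dominates b.
// Quadratic and simple; not how LLVM's dominator tree is built.
std::vector<std::vector<bool>> computeDominators(const CFG &succs) {
  const std::size_t n = succs.size();
  std::vector<std::vector<int>> preds(n);
  for (std::size_t b = 0; b < n; ++b)
    for (int s : succs[b]) preds[s].push_back(static_cast<int>(b));

  std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
  if (n == 0) return dom;
  dom[0].assign(n, false);
  dom[0][0] = true;  // the entry is dominated only by itself
  for (bool changed = true; changed;) {
    changed = false;
    for (std::size_t b = 1; b < n; ++b) {
      // Intersect the dominator sets of all predecessors, then add b.
      std::vector<bool> next(n, true);
      for (int p : preds[b])
        for (std::size_t d = 0; d < n; ++d) next[d] = next[d] && dom[p][d];
      next[b] = true;
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

enum class Color { White, Grey, Black };

// Depth-first walk that records every edge whose target is still on the
// DFS stack (a retreating edge). Every cycle contains at least one.
void collectRetreatingEdges(const CFG &succs, int b, std::vector<Color> &color,
                            std::vector<std::pair<int, int>> &out) {
  color[b] = Color::Grey;
  for (int s : succs[b]) {
    assert(s >= 0 && static_cast<std::size_t>(s) < succs.size());
    if (color[s] == Color::Grey)
      out.push_back({b, s});
    else if (color[s] == Color::White)
      collectRetreatingEdges(succs, s, color, out);
  }
  color[b] = Color::Black;
}

// Returns true only when every cycle reachable from the entry is a natural
// loop whose header is listed in finiteHeaders. Anything else is a
// conservative "don't know", reported as false.
bool allLoopsProvablyFinite(const CFG &succs,
                            const std::set<int> &finiteHeaders) {
  if (succs.empty()) return true;
  const std::vector<std::vector<bool>> dom = computeDominators(succs);
  std::vector<Color> color(succs.size(), Color::White);
  std::vector<std::pair<int, int>> retreating;
  collectRetreatingEdges(succs, 0, color, retreating);

  for (const auto &edge : retreating) {
    const int from = edge.first, header = edge.second;
    if (!dom[from][header]) return false;           // unnatural loop: bail
    if (!finiteHeaders.count(header)) return false; // unknown trip count: bail
  }
  return true;
}
```

With a single loop such as 0 -> 1, 1 -> 2, 1 -> 3, 2 -> 1, the call only
returns true when block 1 is in finiteHeaders. With the classic
two-entry cycle (0 -> 1, 0 -> 2, 1 -> 2, 2 -> 1) it returns false no
matter what finiteHeaders contains.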

Nick