[PATCH] A new HeapToStack allocation promotion pass
Hal Finkel
hfinkel at anl.gov
Mon Oct 7 08:31:42 PDT 2013
----- Original Message -----
> On Oct 5, 2013, at 3:29 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> >>> Yep, Nick also pointed this out; thanks for confirming!
> >>
> >> No problem. Here's one with longjmp().
> >
> > It seems in general that we have two situations to deal with:
> >
> > 1. If the pointer (or some alias) is captured, and we (or some
> > function we call) have some indefinite loop (including the use of
> > operating-system-assisted synchronization primitives), then some
> > other thread might free the memory. Maybe I could call safe
> > functions in this regard 'non-blocking'?
>
> Have you considered changing your approach, to base it on nocapture
> instead?
>
> From one of your emails, you mentioned that you're mostly interested
> in the template case where the callee graph is pretty well known.
> Given that, you should be able to turn this into a simple function
> pass that doesn't require interprocedural knowledge: only allow it
> to be passed to no-capture calls. This is a very simple form of
> escape analysis.
Yes, I thought about that. Unfortunately, we may be too far down the rabbit hole already ;)
If the value is captured, then I need to make sure that there are no blocking (synchronizing) calls along the execution path. This includes analyzing the function containing the malloc/free. I could just reject captured malloc values, but unless I'm going to reject every execution path that contains any function call, I still need to know whether the functions will return normally (regardless of whether or not the malloc is captured). And since I need to know whether the functions on the execution path return normally anyway, I might as well look for the problematic loops (and atomic/volatile accesses) while I'm at it.
I could certainly start with a version that just rejects all function calls, but it is not clear to me that the rest of it is particularly complicated; and I would not be satisfied with such a solution: it would miss a bunch of low-hanging fruit.
I am somewhat concerned about compile-time impact if looking for loops with an indeterminate iteration count involves running SE (ScalarEvolution) on everything, but as far as I can tell, I'd actually get pretty far by just rejecting all loops (that would miss much less of the low-hanging fruit). So if we need a fall-back position, I'd rather go there. Maybe with some additional use-case-driven logic on top of that if the overhead can be kept sufficiently low.
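To make the rule above concrete, here is a rough, self-contained sketch of the decision as a toy model. It is not code from the patch and uses no LLVM APIs; CallSummary, PathInfo, and canPromoteToStack are hypothetical names, and the "may block" flag stands in for the conservative fall-back of treating every loop (and every atomic/volatile access) as potentially blocking.

```cpp
#include <cassert>
#include <vector>

// Hypothetical summary of one call on the path from malloc to free.
// These fields are assumptions made for illustration only.
struct CallSummary {
    bool returnsNormally; // false if the callee might longjmp, exit, or unwind
    bool mayBlock;        // contains any loop or atomic/volatile access
    bool mayCapture;      // the pointer (or an alias) may escape through this call
};

// Hypothetical summary of the path inside the function holding the malloc/free.
struct PathInfo {
    std::vector<CallSummary> calls; // calls executed between malloc and free
    bool localMayBlock;             // loop or atomic/volatile access in the function itself
    bool localCapture;              // pointer stored somewhere another thread could see
};

// Conservative check: every call must return normally, and a captured
// pointer must never coexist with anything that might block on the path.
bool canPromoteToStack(const PathInfo &p) {
    bool captured = p.localCapture;
    bool mayBlock = p.localMayBlock;
    for (const CallSummary &c : p.calls) {
        if (!c.returnsNormally)
            return false; // needed whether or not the pointer is captured
        captured = captured || c.mayCapture;
        mayBlock = mayBlock || c.mayBlock;
    }
    // Blocking only matters when some other thread could hold the pointer.
    return !(captured && mayBlock);
}
```

In this model, a path with a capturing call and no loops anywhere is accepted, while the same path with a single loop in any callee is rejected. A real pass would compute these flags from the IR rather than take them as inputs.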
Thanks again,
Hal
>
> -Chris
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory