[PATCH] A new HeapToStack allocation promotion pass

Mon Oct 7 09:56:57 PDT 2013

----- Original Message -----
> 
> 
> 
> 
> On Mon, Oct 7, 2013 at 8:31 AM, Hal Finkel < hfinkel at anl.gov > wrote:
> 
> 
> 
> If the value is captured, then I need to make sure that there are no
> blocking (synchronizing) calls along the execution path. This
> includes analyzing the function containing the malloc/free. I could
> just reject captured malloc values, but, unless I'm going to reject
> any execution path with any function call, then I need to know if
> the functions will return normally (regardless of whether or not the
> malloc is captured). And since I need to know if the functions on
> the execution path return normally anyway, I might as well look for
> the problematic loops (and atomic/volatile accesses) while I'm at
> it.
> 
> 
> 
> I really, *really* think that a function attribute that is propagated
> up the call graph is the only scalable and effective way to
> implement this. I think doing it in your pass is the wrong design.
> 

Yes; I'm sorry if I was not clear about this. I never planned on doing this analysis with the HeapToStack pass (except, perhaps, that loops in the current function may need to be analyzed there). The analysis of other functions needs to be propagated up the call graph. I completely agree that there is no other scalable (or even correct?) way of doing this.

> 
> 
> 
> I could certainly start with a version that just rejects all function
> calls, but it is not clear to me that the rest of it is particularly
> complicated; and I would not be satisfied with such a solution: it
> would miss a bunch of low-hanging fruit.
> As discussed with you in person, I think that this is essentially
> going to be a prototype anyways, and will need to be modified
> significantly (both in design and implementation) before being
> enabled. So I think a conservative version is fine, and I wouldn't
> even bother enabling it by default. The goal of committing an early
> version is not to start catching some missed optimization
> opportunities but to make subsequent development incremental and
> easier to review.

I agree that this is not the final implementation that we'd like, and I'm happy to call it a prototype. Nevertheless, I think that we need it to be a fully functional prototype. I'm afraid that if we defer the whole question of dealing with function calls until the development of the 'real' version, we'll miss something.

As far whether or not we enable it my default, I think that we should make that decision on its own merits, plus how long we think the refactoring for SROA interaction and the associated SROA work will take. I think that if we can enable it by default, then we should, because this will help us find any remaining bugs. Nevertheless, we're not close to this point yet.

> 
> I still really do not think we have the right design here
> specifically as regards interactions with SROA. I think that's OK
> because we can fix that incrementally.

Agreed.

Thanks again,
Hal

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory