libc++: First cut at <dynarray>

Hal Finkel hfinkel at anl.gov
Thu Sep 12 21:04:03 PDT 2013


----- Original Message -----
> 
> On Thu, Sep 12, 2013 at 8:19 PM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> 
> 
> ----- Original Message -----
> > 
> > 
> > 
> > On Thu, Sep 12, 2013 at 6:23 PM, Howard Hinnant <
> > howard.hinnant at gmail.com > wrote:
> > 
> > 
> > 
> > 
> > Please commit with these changes, and thanks much. Nice job!
> > 
> > 
> > 
> > I'm not sure it is worth it...
> > 
> > 
> > 
> > 
> > Clang team: If we don't have at least some stack support by
> > Chicago,
> > I may recommend removing dynarray for lack of implementation
> > experience. I'm seeking feedback from the clang team on that course
> > of action. If I hear from you that such support is no problem and
> > I'm just being a nervous nanny, I'll back down. But if we're still
> > figuring out how to do this, and no one else has either, then color
> > me nervous nanny. dynarray is not worthy of standardization without
> > stack support.
> > 
> > Speaking from both the Clang and LLVM side: I don't think we know
> > what we want to have to put things on the stack, and I am confident
> > we won't have it by Chicago. There are big, nasty, hard-to-answer
> > questions in the space of compiler-aided variable-sized stack
> > allocation. Currently on x86 with LLVM, if the size is variable and
> > you have a reasonably fast malloc library, putting dynarray on the
> > stack will often be a pessimization. I'm not sure how often we can
> > make variable-sized stack allocation the right choice, but it will
> > at the least require some reasonably significant changes to LLVM's
> > optimizer itself.
> > 
> > 
> > Even then, I currently expect that small vector, or a standard
> > allocator that pulls initially from a fixed-size stack-based pool,
> > will be significantly faster, and the only reason for having
> > dynarray at all was performance! Considering how hard it is to
> > implement, I'm inclined currently to go back to the committee with
> > the lesson "it's really hard, and it won't even be faster. Can we
> > just stick with vector?"
> 
> Not to be argumentative ;) -- but I strongly disagree. Good pool
> allocators are very fast, granted, but tend to add extra memory
> overhead which, in some environments, is hard to justify (I'm
> speaking from personal experience here). Having dynamic stack
> allocation in C++ would be a great step forward.
> 
> Furthermore, I don't see why this would not be trivial to implement:
> 
> 1. We add an intrinsic, like alloca, called alloca_in_caller. This is
> only allowed in functions marked always_inline.
> 2. To the analysis passes, alloca_in_caller looks like malloc.
> 3. When the inliner inlines an alloca_in_caller call, it transforms it
> into an alloca call.
> 
> (and that's it, I think). Perhaps I'm way off base, but it seems
> fairly simple. Is there anything else that's needed?
> 
> 
> 
> In case there's any doubt from other people's replies, this approach
> does not work at all.
> 
> 
> If you just want a local variable-length stack array, just use a VLA.
> C++1y has those (though they're called arrays of runtime bound, not
> variable length arrays). That's not really what dynarray is for --
> dynarray is for "all the other cases" where we didn't want to
> support VLAs in the core language because they make things too
> difficult (as struct members, heap allocation, as a template type
> argument, and so on). And the above approach is broken for most of
> those use cases (in that it generates *wrong code*).
> 
> 
> For instance:
> 
> 
> struct foo {
> std::dynarray<int> my_array;
> foo(int k) : my_array(k * 2) {}
> };
> void f() {
> foo my_foo(12);
> }
> 
> 
> Here, we clearly don't want a stack allocation for the dynarray in
> foo::foo(int). Instead, we want one in f(). And if we allocate a
> 'foo' with 'new', we don't want stack allocation at all. And if we
> have a global 'foo', we want a *global* to be allocated. And so on.
> 

Yep, got it :)
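
To make the three storage cases above concrete, here is a rough sketch. Since std::dynarray is not available to compile against, it uses a hypothetical stand-in, dynarray_sketch, that always takes its elements from the heap; the helper names size_on_stack and make_on_heap are likewise made up for illustration. The point is only that the same foo::foo(int) runs in all three contexts, so the constructor alone cannot decide where the elements should live.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for std::dynarray<int>: the size is fixed at
// construction, and the elements always come from the heap.
class dynarray_sketch {
public:
    explicit dynarray_sketch(std::size_t n)
        : size_(n), data_(new int[n]()) {}  // elements value-initialized to 0

    std::size_t size() const { return size_; }

    int& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }

private:
    std::size_t size_;
    std::unique_ptr<int[]> data_;
};

// Same shape as the example in the quoted message.
struct foo {
    dynarray_sketch my_array;
    foo(int k) : my_array(static_cast<std::size_t>(k) * 2) {}
};

// Static storage: the elements must outlive every stack frame, so an
// alloca performed inside foo::foo(int) would be wrong here.
foo global_foo(3);

// Automatic storage: ideally the elements would live in this function's
// frame, not in the frame of foo::foo(int).
std::size_t size_on_stack() {
    foo my_foo(12);
    return my_foo.my_array.size();
}

// Dynamic storage: the object outlives the frame that created it, so no
// stack allocation is appropriate at all.
std::unique_ptr<foo> make_on_heap(int k) {
    return std::unique_ptr<foo>(new foo(k));
}
```

In this sketch all three cases are safe only because the stand-in never touches the stack; that is the behavior a purely library-level implementation falls back to.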

> 
> The best approach that I have found is:
> * Implement a general heap-to-stack optimization for LLVM

I'm completely on board with this.

> * Add metadata on a call that LLVM recognizes as an allocation
> function (e.g., _Znwm or malloc) that says "try to promote this to the
> stack, even if the allocation size is not constant"

This seems like a simple addition to the first option, allowing the user to override the cost model, right? I imagine that we might want to keep the malloc calls if the cost of the code in between the malloc and free is high (on the assumption that it is better not to unnecessarily increase register pressure by introducing dynamic stack allocations). This metadata would just tell the optimizer to do the transformation whenever it is legal. Is that the idea?
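
As a rough source-level illustration of the "whenever legal" part, the sketch below contrasts an allocation that a heap-to-stack transformation could in principle promote with one it could not. The function names are hypothetical, and nothing here claims that any particular compiler performs the promotion; it only shows the shape of code under discussion.

```cpp
#include <cassert>
#include <cstddef>

// Candidate for promotion: the buffer is allocated and freed inside the
// same function and its address never escapes, so its lifetime is bounded
// by this frame even though the size is only known at run time.
int sum_of_squares(int n) {
    assert(n >= 0);
    int* buf = new int[static_cast<std::size_t>(n)];
    for (int i = 0; i < n; ++i) {
        buf[i] = i * i;
    }
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += buf[i];
    }
    delete[] buf;
    return total;
}

// Not a candidate: the pointer is returned to the caller, so the storage
// must outlive this frame and has to stay on the heap.
int* make_squares(int n) {
    assert(n >= 0);
    int* buf = new int[static_cast<std::size_t>(n)];
    for (int i = 0; i < n; ++i) {
        buf[i] = i * i;
    }
    return buf;  // caller is responsible for delete[]
}
```

In the first function the amount of work between the allocation and the delete[] is what a cost model would weigh; a "promote whenever legal" hint would skip that weighing but would still have to leave the second function alone.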

Thanks again,
Hal

> 
> 
> I think I've persuaded Nick that this is both viable and better than
> the alternatives we considered.
> 
> 
> 
> Thanks again,
> Hal
> 
> 
> 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


