libc++: First cut at <dynarray>

> > Speaking from both the Clang and LLVM side: I don't think we know
> > what we want to have to put things on the stack, and I am confident
> > we won't have it by Chicago. There are big, nasty, hard to answer
> > questions in the space of compiler-aided variable sized stack
> > allocation. Currently on x86 with LLVM, if the size is variable and
> > you have a reasonably fast malloc library, putting dynarray on the
> > stack will often be a pessimization. I'm not sure how often we can
> > make variable sized stack allocation the right choice, but it will
> > at the least require some reasonably significant changes to LLVM's
> > optimizer itself.
> Not to be argumentative ;) -- but I strongly disagree. Good pool
> allocators are very fast, granted, but tend to add extra memory overhead
> which, in some environments, is hard to justify (I'm speaking from personal
> experience here). Having dynamic stack allocation in C++ would be a great
> step forward.
> Furthermore, I don't see why this would not be trivial to implement:
>  1. We add an intrinsic, like alloca, called alloca_in_caller. This is
> only allowed in functions marked always_inline.
>  2. To the analysis passes, alloca_in_caller looks like malloc
>  3. When the inliner inlines an alloca_in_caller call, it transforms it
> into an alloca call
> (and that's it I think). Perhaps I'm way off base, but it seems fairly
> simple. Is there anything else that's needed?

In case there's any doubt from other people's replies, this approach does
not work at all.

If you just want a local variable-length stack array, just use a VLA. C++1y
has those (though they're called arrays of runtime bound, not variable
length arrays). That's not really what dynarray is for -- dynarray is for
"all the other cases" where we didn't want to support VLAs in the core
language because they make things too difficult (as struct members, heap
allocation, as a template type argument, and so on). And the above approach
is broken for most of those use cases (in that it generates *wrong code*).

For instance:

struct foo {
  std::dynarray<int> my_array;
  foo(int k) : my_array(k * 2) {}
};

void f() {
  foo my_foo(12);
}

Here, we clearly don't want a stack allocation for the dynarray in
foo::foo(int). Instead, we want one in f(). And if we allocate a 'foo' with
'new', we don't want stack allocation at all. And if we have a global
'foo', we want a *global* to be allocated. And so on.
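
To make the three cases concrete, here is a rough sketch. Since <dynarray>
is not generally available, it uses a hypothetical stand-in,
dynarray_sketch, that always allocates on the heap; the helper names
(use_local, make_heap_foo, global_foo) are likewise made up for
illustration. The point is only that foo::foo(int) is the same code in all
three contexts, so the constructor alone cannot pick the right storage.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for std::dynarray<int>: the size is fixed at
// construction and the elements always come from the heap.
class dynarray_sketch {
  std::size_t size_;
  std::unique_ptr<int[]> data_;

public:
  explicit dynarray_sketch(std::size_t n) : size_(n), data_(new int[n]()) {}
  std::size_t size() const { return size_; }
  int &operator[](std::size_t i) {
    assert(i < size_);
    return data_[i];
  }
};

struct foo {
  dynarray_sketch my_array;
  explicit foo(int k) : my_array(static_cast<std::size_t>(k) * 2) {}
};

// Static storage: ideally the elements would be a global as well.
foo global_foo(3);

// Automatic storage: ideally the elements would live in this frame,
// not in the frame of foo::foo(int), which is gone once it returns.
std::size_t use_local() {
  foo my_foo(12);
  my_foo.my_array[23] = 7;
  return my_foo.my_array.size();
}

// Dynamic storage: the object outlives this function, so no stack
// frame is a valid home for the elements.
std::unique_ptr<foo> make_heap_foo() {
  return std::unique_ptr<foo>(new foo(5));
}
```

An alloca placed in (or inlined out of) the constructor would be correct
for none of these: in use_local() it lands in the wrong frame unless every
layer is inlined, and in the other two cases it dangles as soon as the
constructor returns.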

The best approach that I have found is:
 * Implement a general heap-to-stack optimization for LLVM
 * Add metadata on a call that LLVM recognizes as an allocation function
(e.g., _Znwm, the mangled name of operator new(unsigned long), or malloc)
that says "try to promote this to the stack, even if the allocation size
is not constant"
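
As a hand-written illustration of what such a promotion would amount to
(not what LLVM emits, and not showing the metadata itself, whose form is
not specified here), consider a function whose allocation never escapes.
The function names are made up, and the "after" version assumes the
GCC/Clang builtin __builtin_alloca is available.

```cpp
#include <cassert>
#include <cstddef>

// Before: the buffer comes from operator new[]. The pointer is never
// stored anywhere or returned, and it is freed on every path.
long sum_squares_heap(std::size_t n) {
  long *buf = new long[n];
  for (std::size_t i = 0; i < n; ++i)
    buf[i] = static_cast<long>(i) * static_cast<long>(i);
  long total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += buf[i];
  delete[] buf;
  return total;
}

// After, written by hand: same computation, but the buffer is a
// variable-sized stack allocation released when the function returns.
// Assumes a GCC/Clang-style __builtin_alloca and a small enough n.
long sum_squares_stack(std::size_t n) {
  long *buf = static_cast<long *>(__builtin_alloca(n * sizeof(long)));
  for (std::size_t i = 0; i < n; ++i)
    buf[i] = static_cast<long>(i) * static_cast<long>(i);
  long total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += buf[i];
  return total;
}
```

Because the decision is made on the allocation call after inlining and
escape analysis, the same dynarray constructor can end up on the stack in
f() and stay on the heap when the object is created with 'new'.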

I think I've persuaded Nick that this is both viable and better than the
alternatives we considered.

