libc++: First cut at <dynarray>

Thu Sep 12 20:57:36 PDT 2013

----- Original Message ----- 
> 
> On Thu, Sep 12, 2013 at 8:33 PM, Marshall Clow <
> mclow.lists at gmail.com > wrote:
> 
> 
> 
> {
> return new dArray ( 6 );
> }
> 
> These six longs cannot be put on the stack. When would they be
> deallocated?
> 
> 
> 
> Sorry, yes, I always forget about the heap-allocated-dynarray case.
> 
> 
> This just seems so fundamentally broken. How is it reasonable to heap
> allocate a class whose only purpose is to optimize through use of
> the stack? It just doesn't work.
> 
> 
> 
> (There's a third case, where a dynamic array is a member variable in
> a class or struct, but I think these two cover it well enough)
> 
> In the discussion with Nick and Richard, we came up with the idea of
> a compiler intrinsic that is a hint (i.e., try to put this allocation
> on the stack), and then, if during the compiler's optimization pass,
> it decides that it has enough information (i.e., can see the whole
> lifetime of the dynarray, etc), then it can change the allocation
> (call to ::operator new) to a call to something like alloca. [ Same,
> obviously for deallocation ]
> 
> Yes, this is one way we can do things. And yet, I contend the
> *correct* way to do this is without any hint at all and instead
> having the compiler simply do this for boring old std::vector *IF*
> it can prove a constant upper-bound on size.

This is an optimization that we'd like to have regardless. That said, would the simpler semantics of a dynarray make this easier in practice? [I don't see why they would, but I've tried analyzing inlined vector constructor code myself.] The other aspect of this, perhaps just as important, is user expectation. The limited semantics of dynarray help enforce a set of restrictions intended to keep the array on the stack, and this should help keep other programmers/tools from accidentally screwing it up later.
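
To make "limited semantics" concrete, here is a minimal sketch of a fixed-size array class. It is not libc++'s <dynarray>; the names fixed_array and sum_first are hypothetical, the storage is a plain heap allocation, and copying is disabled purely to keep the sketch short. The point is only that the size is set once at construction and nothing can grow or reseat the storage, so the allocation lives exactly as long as the object does.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical sketch, not libc++'s <dynarray>: the size is fixed at
// construction and there is no resize/push_back, so the storage lifetime
// is exactly the object's lifetime.
template <class T>
class fixed_array {
public:
    explicit fixed_array(std::size_t n)
        : size_(n), data_(new T[n]()) {}  // heap allocation, value-initialized

    // Copying is disabled here only to keep the sketch short.
    fixed_array(const fixed_array&) = delete;
    fixed_array& operator=(const fixed_array&) = delete;

    std::size_t size() const { return size_; }

    T& operator[](std::size_t i) { assert(i < size_); return data_[i]; }
    const T& operator[](std::size_t i) const { assert(i < size_); return data_[i]; }

    T* begin() { return data_.get(); }
    T* end() { return data_.get() + size_; }

private:
    std::size_t size_;
    std::unique_ptr<T[]> data_;
};

// A local whose whole lifetime is visible inside one function: the kind of
// case where an optimizer could, in principle, move the storage to the stack.
inline long sum_first(std::size_t n) {
    fixed_array<long> a(n);
    for (std::size_t i = 0; i < n; ++i) a[i] = static_cast<long>(i);
    long total = 0;
    for (long v : a) total += v;
    return total;
}
```

Because the interface offers no way to change the size after construction, a later edit cannot quietly introduce a reallocation the way a stray push_back on a std::vector could.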

> Otherwise, we're still
> stuck in the pit of having dynamic allocas which are simply not well
> supported in LLVM today.
> 
> Now, it is possible we could go through and build the necessary
> infrastructure to fully optimize even in the presence of dynamic
> allocas, but that is not the compiler we have today and I'm not
> convinced it is the compiler anyone has today.

I understand what you're saying, but I actually don't think that this is a major problem. I see dynamic stack allocations used a lot in HPC, normally like this:

void foo(int n, ...) {
  double a[n], b[n], c[n];

  for (...) {
    // loops which use a, b, c as intermediate arrays or whatever
  }

  some_function(a, n);

  ...
}

The fact that LLVM would not inline foo is fairly irrelevant and, FWIW, not all architectures are quite as register-pressure dominated for GPRs as x86 ;) In any case, I'd be happy to help tune the inlining heuristics to better deal with dynamic stack allocation.
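
For comparison, a minimal sketch of the same pattern in standard C++, where variable-length arrays are not available and the temporaries become std::vector objects. The loop body and the definition of some_function are invented for illustration; only the overall shape (three size-n scratch arrays, a loop, then a call taking a pointer and a length) follows the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the some_function(a, n) call in the example.
inline double some_function(const double* a, std::size_t n) {
    assert(a != nullptr || n == 0);
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += a[i];
    return s;
}

// Same shape as foo above, with heap-backed std::vector temporaries
// in place of the variable-length arrays.
inline double foo(std::size_t n) {
    std::vector<double> a(n), b(n), c(n);
    for (std::size_t i = 0; i < n; ++i) {
        b[i] = static_cast<double>(i);  // illustrative data only
        c[i] = 2.0;
        a[i] = b[i] * c[i];
    }
    return some_function(a.data(), n);
}
```

Here each call to foo makes three heap allocations whose lifetimes end when foo returns, which is the case the stack-promotion idea discussed in this thread is aimed at.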

Thanks again,
Hal

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory