<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Sep 12, 2013 at 8:19 PM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank" class="cremed">hfinkel@anl.gov</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb adM"><div class="im">> Speaking from both the Clang and LLVM side: I don't think we know<br>

> what we want to have to put things on the stack, and I am confident<br>

> we won't have it by Chicago. There are big, nasty, hard to answer<br>

> questions in the space of compiler-aided variable sized stack<br>

> allocation. Currently on x86 with LLVM, if the size is variable and<br>

> you have a reasonably fast malloc library, putting dynarray on the<br>

> stack will often be a pessimization. I'm not sure how often we can<br>

> make variable sized stack allocation the right choice, but it will<br>

> at the least require some reasonably significant changes to LLVM's<br>

> optimizer itself.<br>

><br>

><br>

> Even then, I currently expect that small vector, or a standard<br>

> allocator that pulls initially from a fixed-size stack-based pool,<br>

> will be significantly faster, and the only reason for having<br>

> dynarray at all was performance! Considering how hard it is to<br>

> implement, I'm inclined currently to go back to the committee with<br>

> the lesson "it's really hard, and it won't even be faster. can we<br>

> just stick with vector?"<br>

<br>

</div></div>Not to be argumentative ;) -- but I strongly disagree. Good pool allocators are very fast, granted, but tend to add extra memory overhead which, in some environments, is hard to justify (I'm speaking from personal experience here).</blockquote>

<div><br></div><div>But I'm not speaking of pool allocators. I'm speaking of using a fixed maximum amount of stack space, and then falling back to the heap when growing past that bound. Having the fixed bound lets the stack reservation be a fixed size known at compile time rather than a dynamic one.</div>
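<div><br></div><div>A minimal sketch of the fixed-bound idea, under stated assumptions: the class name StackOrHeapBuffer, its interface, and the helper sum_sequence are hypothetical and come from no existing library. The inline array has a compile-time size N, so it occupies an ordinary fixed slot in the enclosing frame; only requests larger than N touch the heap.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper (not from any library): a buffer that uses a
// fixed-size inline array for up to N elements and falls back to the
// heap when more are requested.
template <typename T, std::size_t N>
class StackOrHeapBuffer {
public:
    explicit StackOrHeapBuffer(std::size_t count) : size_(count) {
        if (count > N) {
            heap_.reset(new T[count]());  // heap fallback, value-initialized
            data_ = heap_.get();
        } else {
            data_ = inline_;              // fits in the fixed-size storage
        }
    }

    // Copying is disabled because data_ may point into this object.
    StackOrHeapBuffer(const StackOrHeapBuffer&) = delete;
    StackOrHeapBuffer& operator=(const StackOrHeapBuffer&) = delete;

    T& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
    std::size_t size() const { return size_; }
    bool on_heap() const { return heap_ != nullptr; }

private:
    T inline_[N] = {};           // size is a compile-time constant
    std::unique_ptr<T[]> heap_;  // used only when count > N
    T* data_ = nullptr;
    std::size_t size_;
};

// Illustrative use: sums 0 .. count-1 through scratch storage.
long sum_sequence(std::size_t count) {
    StackOrHeapBuffer<long, 64> buf(count);
    for (std::size_t i = 0; i < count; ++i) {
        buf[i] = static_cast<long>(i);
    }
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += buf[i];
    }
    return total;
}
```

<div>When the buffer is a local variable, the inline array is part of the function's fixed frame, so no dynamically sized alloca is involved. The cost is that N elements are always reserved, even when fewer are used.</div>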

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Having dynamic stack allocation in C++ would be a great step forward.<br></blockquote><div><br></div><div>

I don't think experience with pool-based allocators motivates this, and I think you need actual benchmark numbers based on LLVM, Clang, and modern C++ code to motivate it.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<br>

Furthermore, I don't see why this would not be trivial to implement:<br>

<br>

 1. We add an intrinsic, like alloca, called alloca_in_caller. This is only allowed in functions marked always_inline.<br>

 2. To the analysis passes, alloca_in_caller looks like malloc<br>

 3. When the inliner inlines an alloca_in_caller call, it transforms it into an alloca call<br>

<br>

(and that's it I think). Perhaps I'm way off base, but it seems fairly simple. Is there anything else that's needed?<br></blockquote><div><br></div><div>You don't need any of the inline stuff. You just need some builtin which directly produces a dynamically sized alloca instruction and the stacksave/stackrestore pair to manage its lifetime.</div>
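<div><br></div><div>For comparison, a rough sketch of a dynamically sized stack allocation as it can be written today. It assumes GCC or Clang, since __builtin_alloca is a compiler extension and not standard C++; the function name sum_of_squares is hypothetical. The size is only known at run time, which is the kind of allocation that becomes a dynamic alloca in the compiler. Nothing here manages the lifetime explicitly: the storage is simply released when the function returns.</div>

```cpp
#include <cstddef>

// Hypothetical example function. __builtin_alloca is a GCC/Clang
// extension (an assumption about the toolchain), not standard C++.
long sum_of_squares(std::size_t n) {
    // Runtime-sized stack allocation; released when this function returns.
    long* scratch = static_cast<long*>(__builtin_alloca(n * sizeof(long)));

    for (std::size_t i = 0; i < n; ++i) {
        scratch[i] = static_cast<long>(i * i);
    }

    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += scratch[i];
    }
    return total;
}
```

<div>Unlike a fixed-size local array, the amount of stack consumed here depends on n, so a large n can overflow the stack with no error reported.</div>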

<div><br></div><div>But the consequences of using this are pretty significant. The existence of a dynamic alloca impacts the set of registers available and has other unfortunate side effects. As a consequence, the inliner refuses to inline a function with a dynamic alloca into a caller which currently has no dynamic alloca. Solving these fundamental problems is actually somewhat hard, and they all go away if you remove the *dynamic* aspect of the stack allocation.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

Thanks again,<br>

Hal</blockquote></div><br><br></div></div>