[LLVMdev] Dataflow analysis based optimisations

Kenneth Uildriks kennethuil at gmail.com
Tue Aug 31 06:47:01 PDT 2010

On Tue, Aug 31, 2010 at 6:10 AM, David Given <dg at cowlark.com> wrote:
> On 29/08/10 16:28, Kenneth Uildriks wrote:
> [...]
>> Now the case where function A calls function B, and function B needs
>> to return a pointer to a freshly-allocated object, is fairly common.
>> One way to attack this is to have each function create its own
>> "region" or "memory pool" and destroy it when it returns.
> [...]
>> This
>> scheme will allow any call depth to be handled without a GC heap
>> allocation, as long as no pointer to the object is ever stored in a
>> global or heap object or passed through a capturing function
>> parameter.
> If I've understood you correctly, this still requires a run-time memory
> allocation from the appropriate memory pool --- so I still get the
> overhead of running a heap, although I do still get some optimisation
> due to being able to explicitly free the pool rather than relying on the
> garbage collector to do it.

Each function's allocator can start by using (preallocated) stack
space, then switch to allocating from the heap when stack space is
used up.  Each heap allocation can be the same size, making
fragmentation issues go away... your function allocator simply moves a
pointer through its allocated space and doesn't support any form of
delete.
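A minimal sketch of that scheme, with all names illustrative: the
allocator bumps a pointer through a preallocated stack buffer, falls
back to same-sized heap chunks when the buffer runs out, never supports
per-object delete, and frees every chunk at once when the pool (i.e.
the function) dies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

class BumpAllocator {
    static const std::size_t kChunkSize = 4096;  // every heap chunk is the same size
    struct Chunk {
        Chunk* next;
        alignas(std::max_align_t) unsigned char data[kChunkSize];
    };

    alignas(std::max_align_t) unsigned char stackBuf[512];  // preallocated stack space
    unsigned char* cur;     // bump pointer into the current region
    unsigned char* end;     // end of the current region
    Chunk* chunks;          // heap chunks, kept only so the destructor can free them

public:
    BumpAllocator() : cur(stackBuf), end(stackBuf + sizeof(stackBuf)), chunks(nullptr) {}

    void* allocate(std::size_t n) {
        // Round up so every object stays suitably aligned.
        n = (n + alignof(std::max_align_t) - 1) & ~(alignof(std::max_align_t) - 1);
        if (static_cast<std::size_t>(end - cur) < n) {
            // Stack space exhausted: switch to a fresh fixed-size heap chunk.
            assert(n <= kChunkSize && "object larger than a chunk");
            Chunk* c = static_cast<Chunk*>(std::malloc(sizeof(Chunk)));
            c->next = chunks;
            chunks = c;
            cur = c->data;
            end = c->data + kChunkSize;
        }
        void* p = cur;
        cur += n;  // no delete: the pointer only ever moves forward
        return p;
    }

    ~BumpAllocator() {
        // The whole pool is released at once when the function returns.
        while (chunks) {
            Chunk* next = chunks->next;
            std::free(chunks);
            chunks = next;
        }
    }
};
```

Because every heap chunk is the same size and nothing is ever freed
individually, the allocator cannot fragment; the only per-allocation
cost on the fast path is a bounds check and a pointer bump.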

I've used this scheme manually in several C++ projects (usually
without a fallback garbage collector, although I occasionally reach
for boost::shared_ptr), and found it greatly reduced development
effort while improving code quality and performance.  In my own front
end, I'm planning to automate the process much like I've outlined
above, and in the early versions, have it flag an error when it can't
prove the usage to be safe and would need to fall back to a GC.

> I think all I need is the simple case where I lower calls to malloc to
> alloca instructions if the result is only ever passed to nocapture
> functions and is not returned. The function-that-returns-a-pointer case
> (which my language does a lot) may be trivially manageable by relying on
> said functions being inlined; I've yet to get LLVM to inline anything
> yet, so I don't know how well this will work in practice.

As long as the function-that-returns-a-pointer is fairly small, it
should almost always be inlined.  You can slap an "alwaysinline"
attribute on such functions if you like, and that should guarantee it
(outside of a few cases like functions that use setjmp, indirect
branches, or recursion).
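At the C/C++ source level the LLVM "alwaysinline" attribute corresponds
to Clang's (and GCC's) always_inline function attribute. A hedged
sketch of the pattern under discussion, with illustrative names: a
small factory that returns a pointer to a freshly-allocated object,
marked so the caller is guaranteed to inline it, after which the
allocation can be optimised as a caller-local one when the pointer
provably doesn't escape.

```cpp
// A small function-that-returns-a-pointer.  Marking it always_inline
// guarantees it is inlined into callers (emitting "alwaysinline" in
// the IR), so a later pass sees the allocation and the uses together.
__attribute__((always_inline)) inline int* makeCounter(int start) {
    return new int(start);  // freshly-allocated object returned to the caller
}

int useCounter() {
    // After inlining, the allocation is local to this function and the
    // pointer never escapes, so it is a candidate for stack promotion.
    int* c = makeCounter(41);
    int v = *c + 1;
    delete c;
    return v;
}
```

Note this only helps within a module; as discussed below, inlining
cannot cross a dynamic-library boundary.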

Looking through the code, I see that functions with dynamic alloca's
don't get inlined into callers that don't have dynamic alloca's,
unless you use the alwaysinline attribute.

One problem with using inlining this way... if you ever try to expose
a dynamic library with functions like this, clients of the dynamic
library won't even try to inline across module boundaries.  This means
that your library functions will alloca the space they need, and then
return pointers to their own stack frames.
