[LLVMdev] Stack roots and function parameters

Sat Sep 25 09:49:33 PDT 2010

On Sat, Sep 25, 2010 at 3:45 AM, Jon Harrop <
jonathandeanharrop at googlemail.com> wrote:

>  Forgive my top post but I hate Windows. J
>
>
>
> I am surprised you (Talin) say that “we know conservative collectors work”
> because my experience has very much been of them not working. Indeed, if you
> have 400Mb of allocated heap blocks on a 32-bit machine is there not a 10%
> chance of **each** random 32-bit int “pointing” into your heap, i.e. a
> false positive? I just did a simple test here (create an array with 1M
> random 32-bit ints and then allocate blocks n=1..2^16 of size n bytes) and
> Boehm’s conservative GC sometimes takes 13s and other times runs out of
> memory after several minutes. It even complains about the “Repeated
> allocation of very large block (appr. size 65536)”. Is 64kb really “very
> large” today? I think not…
>

I guess in this case, my definition of "works" is that the Boehm collector
appears to be the default choice for 90% of open source projects that
require garbage collection of some kind. However, I agree that the
performance is problematic.

>
>
> Performance is certainly an important reason to implement an accurate
> collector though. However, the performance of a conservative collector is
> surely dominated by the cost of searching to see if each int “points” to
> within any allocated block, a completely unnecessary operation that accurate
> collectors simply do not perform. That overhead will dwarf any doubling up
> of roots on the stack but the main problem with conservative collectors, as
> I say, is not that they are slow but that they leak.
>
>
>
> I agree that eliminating redundant roots is a desirable goal but I did only
> the most basic things for HLVM and was very happy with the results. So I
> think it is premature to value such optimizations. I personally would much
> rather see a mature VM built on top of LLVM.
>

Well, I've managed to reduce the redundant roots to a level that I am
satisfied with, at least for the moment.

The thing about stack roots is that stack space isn't always growable like
the heap, and on some platforms isn't very large. This is especially an
issue in environments with thousands of threads.

>
>
> Cheers,
>
> Jon.
>
>
>
> Talin wrote:
>
> Here's another way to think about the issue: Compare this whole approach to
> stack roots to that of a conservative collector. Since we know conservative
> collectors work, the whole reason for making an accurate collector in the
> first place is efficiency, right? If we couldn't make an accurate collector
> that was more efficient than a conservative collector, then no one would
> bother because it's such a pain.
>
>
>
> However, looking strictly at stack frames, a conservative collector has a
> much more compact stack layout. There's no need to copy anything to local
> variables, because by the time you call into the collector proper,
> everything - every register variable, every function parameter, and so on -
> is already on the stack *somewhere*.
>
>
>
> The stack frame layout dictated by llvm.gcroot, by contrast, is much, much
> fatter. In a typical deeply-nested call stack, there will be many copies of
> the same value on the stack. Let's take the 'this' pointer as an example.
> Say we have some method that calls itself recursively, and we are currently
> 10 levels deep. That means that we have 20 copies of the 'this' pointer on
> the stack: Because each invocation of the method has to push the 'this'
> pointer on the stack as part of the calling protocol (even if it's passed in
> registers, it still has to save the prior value of the register somewhere),
> and in addition we also have to save the 'this' pointer as a stack root. A
> conservative collector, on the other hand, would be able to get by with only
> 10 copies of the 'this' pointer.
>
>
>
> As far as the issue of eliminating redundant stores, I'm pretty sure that
> the optimizers cannot eliminate redundant *roots*. Figuring out the
> minimal set of temporary roots needed for a function call is a non-trivial
> problem (you don't want to just blindly make every pointer argument to a
> function a temporary root, since that just bloats the size of the call frame
> needlessly.)
>
>
>
> My entire goal here is to get a GC that has high performance and
> efficiency. I want my GC to be able to run on everything from a data center
> to a beagle board, and use the minimum resources possible.
>
> --
> -- Talin
>

-- 
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20100925/bf3dff6a/attachment.html>