[LLVMdev] Interpreting stack maps for purposes of precise GC
don.apoch at gmail.com
Tue Jul 30 17:18:29 PDT 2013
I've been using the llvm.gcroot intrinsic combined with the generated
machine stack maps to attempt to do precise GC without the use of a shadow
stack. (This is all research work on a novel language so I have no existing
system to compare against, for the record.)
Most of my test suite is working and tracing stack roots correctly.
However, there seem to be some scenarios where the stack maps do not agree
with reality. I suspect this has to do with SP manipulations during the
execution of some particular piece of code, but it's hard to gather
evidence to corroborate this theory.
I did notice that the stack map is invariant (at the LLVM level) with
respect to which safe point is actually reached within the host function;
i.e. if I have Foo() with two safe points and manipulate the SP between
point A and point B, the stack map becomes bogus because nothing accounts
for the change to the SP.
I'm not sure if this scenario is the actual explanation, but I've also
noticed that occasionally the stack map will just seem wrong; it will mark
certain stack slots as live roots which are outside the bounds of the
actual machine stack frame, for instance. This obviously causes the tracing
phase of the GC to wander off into random bits of memory and (usually)
crash shortly thereafter.
Unfortunately it seems like there is painfully little documentation on how
the stack maps work or are meant to be used, so I was hoping to dig up some
tribal knowledge from the list.
My strategy for interpreting the maps currently looks like this:
if (stack offset <= 0)
pointer to root = start of current stack frame + offset
pointer to root = end of the stack frame "above" current stack frame +
offset + size of frame pointer + size of return address pointer
The reason for this split is that when the offset is negative it seems to
refer to one span of stack space, whereas when it is positive it appears to
be based from a different SP entirely. I found this approach by brute
force, i.e. generating a large number of test cases and mapping out the
stack on paper until the offsets revealed some semblance of a pattern.
However, I'm suspicious about my interpretation of the two cases because of
the aforementioned mis-flagging of roots, but again there seems to be no
documentation whatsoever describing how to actually find a stack address
based on a value in the stack map.
Any/all advice would be much appreciated!
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the llvm-dev