Thanks Will for debugging the problem! Indeed, I commented out the release of the fat lock and it ran without any crashes. That's great news because it looks like we nailed it. The bad news, though, is that if we don't release a fat lock, it can never be reused. Even worse, if we need to keep a pointer to the associated object, the lock will keep that object alive forever.<div>

<br></div><div>If it helps, I'm OK with submitting a change that basically comments out the release of a FatLock. When I have time, I'll investigate further where exactly the race occurs.</div><div><br></div>
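<div>For what it's worth, here is a minimal sketch of what "never release a fat lock" amounts to. It is not the ObjectLocks.cpp code: Object, FatLockTable, kFatMask and lockFor are made-up names, the thin-lock fast path is left out entirely, and inflation is serialized by a plain mutex to keep the example short. The only point is that the fat bit is set once and never cleared, so a check like the one in FatLock::acquire cannot trip.</div>

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

// All names here are hypothetical; this is not the VMKit implementation.
constexpr std::uintptr_t kFatMask = 1;  // low header bit: header holds a fat lock index

struct Object {
  std::atomic<std::uintptr_t> header{0};  // 0 means "never inflated"
};

class FatLockTable {
 public:
  // Return the fat lock for obj, inflating the header on first use.
  std::recursive_mutex& lockFor(Object& obj) {
    std::uintptr_t h = obj.header.load(std::memory_order_acquire);
    if (!(h & kFatMask)) h = inflate(obj);
    // Nothing ever clears the bit, so this check cannot fail later on.
    assert(obj.header.load(std::memory_order_acquire) & kFatMask);
    std::lock_guard<std::mutex> g(guard_);
    return *locks_[h >> 1];
  }

  // Number of fat locks allocated so far; it only ever grows.
  std::size_t size() {
    std::lock_guard<std::mutex> g(guard_);
    return locks_.size();
  }

 private:
  // Set the fat bit exactly once per object, under the table mutex.
  std::uintptr_t inflate(Object& obj) {
    std::lock_guard<std::mutex> g(guard_);
    std::uintptr_t h = obj.header.load(std::memory_order_acquire);
    if (h & kFatMask) return h;  // another thread inflated first
    locks_.push_back(std::make_unique<std::recursive_mutex>());
    h = (static_cast<std::uintptr_t>(locks_.size() - 1) << 1) | kFatMask;
    obj.header.store(h, std::memory_order_release);
    return h;
  }

  std::mutex guard_;
  std::vector<std::unique_ptr<std::recursive_mutex>> locks_;  // never shrinks
};
```

<div>Every object that is ever locked costs one table slot for the rest of the run, which is the resource downside. If a slot also stored a pointer back to its object, it would additionally pin that object.</div>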

<div>Cheers,</div><div>Nicolas<br><br><div class="gmail_quote">On Thu, Dec 1, 2011 at 12:49 AM, Will Dietz <span dir="ltr"><<a href="mailto:wdietz2@illinois.edu">wdietz2@illinois.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="im">On Tue, Nov 29, 2011 at 10:01 AM, Nicolas Geoffray<br>

<<a href="mailto:nicolas.geoffray@gmail.com">nicolas.geoffray@gmail.com</a>> wrote:<br>

> Hi Will,<br>

><br>

> On Tue, Nov 29, 2011 at 3:59 PM, Will Dietz <<a href="mailto:wdietz2@illinois.edu">wdietz2@illinois.edu</a>> wrote:<br>

>><br>

>> Hi,<br>

>><br>

>> On both runtimes, I'm seeing the following assertion occur<br>

>> periodically (unpredictably, same code sometimes tickles it other<br>

>> times not)<br>

>><br>

>> j3: /home/will/vmkit-svn/lib/vmkit/CommonThread/ObjectLocks.cpp:305:<br>

>> bool vmkit::FatLock::acquire(gc *): Assertion `obj->header &<br>

>> ThinLock::FatMask' failed.<br>

><br>

> That's because you have too many cores :).... That's a lame answer. The real<br>

> answer is that there is a bug somewhere in the locking implementation. I<br>

> also experience it on my desktop machine (8 cores), but never on my laptop<br>

> (2 cores).<br>

> Nicolas<br>

><br>

<br>

</div>Thanks!<br>

<br>

Inspired by your response, I looked into this a bit and seem to have a<br>

bit more on the issue:<br>

<br>

The problem occurs when we try to deflate a fat lock back to a thin<br>

lock--there's a race on the object headers that makes the guard (that<br>

wants to be "does anyone have a pointer to this lock anywhere") unsafe<br>

(ObjectLocks.cpp:278-279).  Indeed (from my limited understanding of<br>

the literature, gulp) this seems to be a tricky problem in general and<br>

not just a bug of this implementation.<br>

<br>

Given this, and that we probably want locks to work on SMP machines<br>

(well, I sure do! :D), it seems we have at least two solutions:<br>

<br>

* Implement a more complicated locking solution ("tasuki" locks?).<br>

* Disallow deflating of fat locks altogether.  I've done this locally<br>

to great success--a lot of mysterious crashes/assertion failures are<br>

now gone.  Downside is potential performance hit in terms of locking<br>

overhead and lock resource utilization.<br>

<br>

I'm inclined to suggest we just go with the latter, and put the former<br>

on the TODO list... but of course it's up to you, and I'm curious what<br>

your thoughts on all this are :).<br>

<br>

Thanks!<br>

<span class="HOEnZb"><font color="#888888"><br>

~Will<br>

</font></span></blockquote></div><br></div>