[PATCH] D54889: Fiber support for thread sanitizer

Yuri Per via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 17 07:15:41 PST 2018


yuri added a comment.

In D54889#1326850 <https://reviews.llvm.org/D54889#1326850>, @dvyukov wrote:

> In D54889#1325232 <https://reviews.llvm.org/D54889#1325232>, @yuri wrote:
>
> > Hi Dmitry,
> >  I will try to answer your questions
> >
> > In D54889#1316889 <https://reviews.llvm.org/D54889#1316889>, @dvyukov wrote:
> >
> > > 1. Do we want to synchronize fibers on switch? If fibers use TLS data, or cooperative scheduling (e.g. mutex unlock directly schedules the next critical section owner), or pass some data from prev fiber to next fiber, in all these cases not synchronizing fibers on switch will lead to false positives. As far as I remember in my old experiments I concluded that it's almost impossible to get rid of false positives it tasks/fibers are not synchronized on switch.
> >
> >
> > In case of single-threaded scheduler, it is worth to synchronize fibers on each switch. Currently I do it in client code using _tsan_release()/_tsan_acquire(), but it is possible to add flag for _tsan_switch_to_fiber() that will do it.
> >
> > In case of multithreaded scheduler, client probably has own sychronization primitives (for example, custom mutex), and it is client's responsibility to call corresponding TSAN functions.
>
>
> I think that synchronization will be exactly in the wrong place.
>  Consider, a thread pushes a fiber for execution on a thread pool. It locks a mutex and adds the fiber onto run queue. Then a thread pool thread dequeues the fiber from the run queue under the same mutex. Now the first thread and the thread pool thread are synchronized. Now the thread pool switches to the fiber context, no synchronization happens, so the fiber is _not_ synchronized with the submitting thread. This means that any passed data will cause false positives.
>  Even if the fiber is just created, in the above scheme even reading fiber entry function arguments (and even just writing to the stack) will cause false positives.
>
> If fiber switch will synchronize parent thread and the fiber, then we will get the desired transitive synchronization.
>
> Looking at the race reports produces by the qemu prototype, it seems there are actually all kinds of these false reports. There are false races on TLS storage. There are races on fiber start function. There are numerous races on arguments passed to fibers.
>
> You said that for your case you need the synchronization, and it seems that it's also needed for all other cases too. I think we need to do it. People will get annotations wrong, or don't think about acquire/release at all, and then come to mailing list with questions about all these races. If it will prove to be needed, we can always add an opt-out of synchronization with a new flag for the annotations.


I added default synchronization and flag to opt-out. It would be great if someone can check this version with QEMU.


Repository:
  rCRT Compiler Runtime

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54889/new/

https://reviews.llvm.org/D54889





More information about the llvm-commits mailing list