[PATCH] D54889: Fiber support for thread sanitizer

Dmitry Vyukov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 3 08:17:43 PST 2018


dvyukov added a comment.

Some high-level comments:

1. Do we want to synchronize fibers on switch? If fibers use TLS data, or cooperative scheduling (e.g. mutex unlock directly schedules the next critical section owner), or pass some data from prev fiber to next fiber, in all these cases not synchronizing fibers on switch will lead to false positives. As far as I remember in my old experiments I concluded that it's almost impossible to get rid of false positives it tasks/fibers are not synchronized on switch.

2. This introduces slowdown into the hottest runtime functions (branch and indirection in cur_thread).

I think check_analyze.sh test will fail. Kinda hard to say, because it's broken already at the moment, but this perturbs runtime functions enough:

Before:

    write1 tot 367; size 1425; rsp 8; push 1; pop 5; call 2; load 21; store 13; sh  40; mov  94; lea   4; cmp  39
    write2 tot 373; size 1489; rsp 8; push 1; pop 5; call 2; load 21; store 13; sh  40; mov  95; lea   4; cmp  43
    write4 tot 374; size 1457; rsp 8; push 1; pop 5; call 2; load 21; store 13; sh  40; mov  95; lea   4; cmp  43
    write8 tot 364; size 1429; rsp 8; push 1; pop 5; call 2; load 21; store 13; sh  40; mov  95; lea   4; cmp  39
     read1 tot 409; size 1674; rsp 8; push 1; pop 4; call 2; load 21; store 13; sh  45; mov  97; lea   4; cmp  39
     read2 tot 412; size 1694; rsp 8; push 1; pop 4; call 2; load 21; store 13; sh  45; mov  97; lea   4; cmp  43
     read4 tot 412; size 1694; rsp 8; push 1; pop 4; call 2; load 21; store 13; sh  45; mov  97; lea   4; cmp  43
     read8 tot 404; size 1660; rsp 8; push 1; pop 4; call 2; load 21; store 13; sh  45; mov  97; lea   4; cmp  39
  func_entry tot  42; size  165; rsp 0; push 0; pop 0; call 1; load  4; store  2; sh   3; mov  12; lea   1; cmp   1
   func_exit tot  37; size  149; rsp 0; push 0; pop 0; call 1; load  2; store  1; sh   3; mov   9; lea   1; cmp   1

After:

    write1 tot 367; size 1439; rsp 4; push 1; pop 5; call 2; load 21; store 14; sh  40; mov  93; lea   4; cmp  40
    write2 tot 375; size 1493; rsp 4; push 1; pop 5; call 2; load 21; store 14; sh  40; mov  94; lea   4; cmp  44
    write4 tot 375; size 1471; rsp 4; push 1; pop 5; call 2; load 21; store 14; sh  40; mov  94; lea   4; cmp  44
    write8 tot 365; size 1435; rsp 4; push 1; pop 5; call 2; load 21; store 14; sh  40; mov  94; lea   4; cmp  40
     read1 tot 412; size 1682; rsp 4; push 1; pop 4; call 2; load 21; store 14; sh  45; mov  96; lea   4; cmp  40
     read2 tot 414; size 1698; rsp 4; push 1; pop 4; call 2; load 21; store 14; sh  45; mov  96; lea   4; cmp  44
     read4 tot 414; size 1698; rsp 4; push 1; pop 4; call 2; load 21; store 14; sh  45; mov  96; lea   4; cmp  44
     read8 tot 404; size 1642; rsp 4; push 1; pop 4; call 2; load 21; store 14; sh  45; mov  96; lea   4; cmp  40
  func_entry tot  48; size  197; rsp 0; push 0; pop 0; call 1; load  4; store  3; sh   3; mov  14; lea   1; cmp   2
   func_exit tot  45; size  181; rsp 0; push 0; pop 0; call 1; load  2; store  2; sh   3; mov  11; lea   1; cmp   2

3. I think instead of __tsan_get_current_fiber we could use pointer 0. It prevents things like switching to context of another thread, which won't end well. And just simplifies API surface.

4. What if a fiber calls pthread_exit?

5. Can we combine __tsan_set_fiber_name with __tsan_create_fiber?

6. I think it's better to add flags argument to __tsan_create_fiber/__tsan_switch_to_fiber to not end up with __tsan_create_fiber2 and __tsan_create_fiber3.

7. Do we want to support nested fibers/tasks? Does GCD have nested tasks? I think TBB/OpenMP does. Won't begin/end fiber/task be a more generic API for this? E.g. fiber switching can be expressed as stop running the prev fiber (which switches back to the physical thread context), then start running new fiber.

8. How does this play with setjmp/longjmp? In particular with fiber switching based on setjmp/longjmp? jmp_bufs is part of ThreadState, won't we always try to do longjmp from a wrong ThreadState in such case?

9. Looking at 1, 2, 3, 7 and 8, it seems that we may want to switch some smaller part (subset) of ThreadState. E.g. introduce TaskContext that contains only the part that we need to switch on fiber switches and then have:

ThreadState -> TaskContext (of physical thread, always linked) -> TaskContext (current fiber context)
or:
ThreadState -> TaskContext (of physical thread, always linked) -> TaskContext (of a task) -> TaskContext (of a nested task)
E.g. clock can stay in ThreadState (if we want to synchronize them) which will solve the performance problem, jmp_bufs can stay which will solve part of setjmp problem. This may resolve the hacky_call problem as well.


Repository:
  rCRT Compiler Runtime

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54889/new/

https://reviews.llvm.org/D54889





More information about the llvm-commits mailing list