[Openmp-commits] [PATCH] D108046: tsan: Add new annotation functions to promote all accesses to atomic

Sat Aug 14 01:36:03 PDT 2021

dvyukov added a comment.

> Semantically I could also annotate Mutex semantics. But, from my understanding this introduces HappensBefore edges into the analysis, where I don't want to have them. They would hide even more data races than what I try to solve.

I see. This makes sense.

Re performance, what matters more now is performance of the new runtime that is slowly being upstreamed in pieces:
https://github.com/dvyukov/llvm-project/pull/3

This change itself can be ported to the new runtime, but it still adds instructions to fast paths. And more important it turns what's being compile-time constant into a non-compile time constant (we heavily rely on inlining so kAtomic is known to compiler to be 0 statically, so any branches are eliminates and any derived values are known). I think it will have more negative impact on the new runtime we also trace atomic flag:

  ev->isAtomic = !!(typ & kAccessAtomic);

and use it to create vector consts:

  const m128 access_read_atomic = _mm_set1_epi32((typ & (kAccessRead | kAccessAtomic)) << 30);

if kAccessAtomic is not a const, compiler will need to create several globals and load one of the other. Also for non-atomic writes this is 0, so clearing the register.
I also afraid of long-term effect. Say if we add "if (kAtomic)" (some slow path for atomics only, which does not affect normal accesses at all), with this change it becomes problematic (not only it adds a branch, it also affects register allocation, code layout, etc).

So I would like to consider any alternative as much as possible first.

I see that OpenMP reductions are always tied to a single variable (?):
https://www.openmp.org/spec-html/5.0/openmpsu107.html
If at least the modified variables are known, then I think it's possible to do the following: ignore all reduction accesses as it is done now + emit additional (fake) tsan atomic store annotations for the target variables/memory locations.
I think it should have the same effect w/o long-term tsan runtime tax. Unfortunately tsan atomic hooks do the actual atomic operation as well, but for starters you can try __atomic_fetch_add(0).
Will this work? Do I miss anything?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108046/new/

https://reviews.llvm.org/D108046