[llvm-dev] RFC: EfficiencySanitizer working set tool

Qin Zhao via llvm-dev llvm-dev at lists.llvm.org
Wed Apr 20 11:24:27 PDT 2016

> I've raised the concern about problems with this approach in
> multithreaded environment, but this particular tool is a good example,
> so we can discuss those problems here.
> Your approach suggests storing the state of 512 cache lines in a
> single shadow cache line. So, if I'm understanding the algorithm
> correctly, parallel accesses to 512 adjacent cache lines from
> different CPUs will cause unnecessary contention on that shadow cache
> line, which will presumably degrade the application's performance.
> Also note that because we don't have atomic bitwise operations,
> updates of shadow bytes will require a CAS loop.
> I'm not sure about the performance impact of the above issues, but
> there might be a tradeoff between the shadow memory scale and the
> slowdown.

1. As Derek said, we will use a 64B-to-1B mapping for easier instrumentation.
2. The cache contention is in fact not as bad as you think if we apply the
optimization mentioned by Derek:
"1) Add a shadow bit check before writing each bit to reduce the number of
memory stores.  In our experience this is always a win with shadow memory"
With that check, most shadow accesses would be reads instead of writes, so
there would be much less cache contention.
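
A minimal sketch of the check-before-write idea, assuming a 64B-to-1B mapping
where each shadow byte simply records whether its application cache line has
been touched. The names ShadowMap, Touch, and TouchedLines are hypothetical and
are not taken from the EfficiencySanitizer runtime; the point is only that a
line that is already marked costs a load and no store, so the shadow cache line
can stay in a shared state across CPUs.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical shadow map: one shadow byte per 64-byte application line.
class ShadowMap {
public:
  // The vector value-initializes its elements, so every shadow byte starts at 0.
  explicit ShadowMap(std::size_t app_bytes)
      : shadow_((app_bytes + kLineSize - 1) / kLineSize) {}

  // Marks the line containing `offset` as touched.
  // Returns true only when a store was actually issued.
  bool Touch(std::size_t offset) {
    std::atomic<std::uint8_t> &cell = shadow_[offset / kLineSize];
    // Check first: an already-marked line needs a read only.
    if (cell.load(std::memory_order_relaxed) != 0)
      return false;
    // Racing threads may both reach this store, but they write the same
    // value, so no CAS loop is needed for a whole-byte flag.
    cell.store(1, std::memory_order_relaxed);
    return true;
  }

  // Counts the application lines that have been marked so far.
  std::size_t TouchedLines() const {
    std::size_t n = 0;
    for (const auto &cell : shadow_)
      n += cell.load(std::memory_order_relaxed) != 0;
    return n;
  }

private:
  static constexpr std::size_t kLineSize = 64;
  std::vector<std::atomic<std::uint8_t>> shadow_;
};
```

After the first access to a line, every later Touch on that line takes the
read-only path, which is the case the optimization is meant to make common.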