[cfe-dev] [RFC] Adding Thread Role Analysis

Fri Jun 28 12:11:48 PDT 2013

TRA and the current thread safety checking seem to us to be different bike sheds, both of which are interesting.  For more, see below.

On Jun 24, 2013, at 2:19 PM, Chandler Carruth <chandlerc at google.com> wrote:

On Fri, Jun 21, 2013 at 1:31 PM, Dean Sutherland <dsutherland at cert.org> wrote:
It may, in fact, be possible to combine the capabilities of TRA and
the current thread safety checking to produce something very
interesting.  Let's talk about it!  But we believe that TRA offers
sufficient benefits to warrant its inclusion as a feature in addition
to the existing lock-based analysis.  They're similar in that they
both deal with thread safety, but solve the problem in different ways.

I will freely admit that I've not read all of this thread (it is *very* long; where possible, brevity would help reach a wider audience), but I wanted to expand on the first email I sent to Aaron in response to this comment.

I think it is fundamentally important to approach contributing this type of work to clang as *incremental* improvements to the existing functionality. It would be very disruptive to the project to have a second analysis system surrounding thread safety. I'm not saying that the existing functionality is perfect or must be exactly preserved, I'm just saying that you should propose new functionality via an incremental path from where we are today.
[SNIP]
I think these issues will (to a certain extent) be just as important as the concerns of basing this on sound academic research, and having a good theoretical model behind the analysis and diagnostics produced. As an example, I think one thing that is actively hurting the community in understanding your proposal is trying to first get an existing community to shift terminology to that of specific research papers, and then describing what you want in those terms. Maybe it would be possible to instead use the existing terminology or, where that terminology isn't good, correct it in the existing system before trying to describe new things in terms of it?

The single biggest difference between the existing lock analysis and
thread role analysis is that we're addressing different (but
partly-overlapping) issues.

Lock analysis starts with some data that must be protected from
multi-threaded access.  You then annotate the data to indicate the
specific lock that protects it, and annotate methods to indicate how they
interact with the locks.  Analysis involves tracking data references
(which is the really hard part!) and then running the lock-set algorithm
to make sure you never have an unprotected access. This is great stuff!
This ensures that a program has correct and consistent use of locks and
protection of data.
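
As a rough illustration of the annotation style described above, here is a
minimal sketch using Clang's thread safety attributes.  The attribute
spellings follow the current Clang documentation and may differ from those
available when this message was written.  The Mutex wrapper, the Account
class, and the TSA macro are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <mutex>

// Expand to Clang's thread safety attributes when compiling with Clang,
// and to nothing elsewhere, so the sketch still builds with other compilers.
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif

// Hypothetical annotated wrapper around std::mutex.
class TSA(capability("mutex")) Mutex {
public:
  void lock() TSA(acquire_capability()) { m_.lock(); }
  void unlock() TSA(release_capability()) { m_.unlock(); }

private:
  std::mutex m_;
};

class Account {
private:
  Mutex mu_;
  // The data is annotated with the specific lock that protects it.
  int balance_ TSA(guarded_by(mu_)) = 0;

  // The method is annotated with how it interacts with the lock:
  // callers must already hold mu_.
  void depositLocked(int amount) TSA(requires_capability(mu_)) {
    balance_ += amount;
  }

public:
  void deposit(int amount) {
    assert(amount >= 0);
    mu_.lock();
    depositLocked(amount);
    mu_.unlock();
  }

  int balance() {
    mu_.lock();
    int result = balance_;
    mu_.unlock();
    return result;
  }
};
```

Built with Clang and -Wthread-safety, an access to balance_ without mu_ held
should produce a warning; other compilers simply see unannotated code.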

Thread role analysis operates on policy models such as "Only the GUI
event thread may invoke [long-list-of-methods]." This is a common
idiom in many widely-used frameworks, for example AppKit's policy
of always operating from the primary thread.  Many of these policies exist
because fine-grained locking was too difficult, too expensive, or even
impossible—perhaps due to deadlock potential—for the problem at hand.
Other frameworks use a "no concurrency" model, outsourcing concurrency from
the application to the framework. Although this serves to provide thread
confinement of data, it also places restrictions on what *functions* may be
invoked, independent of the *data* those functions may touch. For example, such
code may never start a thread running, grab a thread from a pool, etc.  The
Actor design pattern provides an example of the "no concurrency in the
client" approach.
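
To make the "only the GUI event thread may invoke ..." policy concrete, here
is a small runtime sketch.  Note the assumption: TRA as proposed is a static
analysis, whereas this example checks the policy at run time, and it does not
show the proposed annotation syntax.  ThreadRole, requireRole, repaint, and
tryRepaintAs are all hypothetical names chosen for illustration.

```cpp
#include <stdexcept>
#include <thread>

// Hypothetical roles; the names are illustrative only.
enum class ThreadRole { Unassigned, GuiEvent, Worker };

// Each thread records its own role in thread-local storage.
thread_local ThreadRole currentRole = ThreadRole::Unassigned;

// Throws if the calling thread does not hold the required role.
void requireRole(ThreadRole required) {
  if (currentRole != required)
    throw std::logic_error("called from a thread with the wrong role");
}

// Policy: only the GUI event thread may repaint.  The restriction is on the
// function itself, not on any particular lock or piece of data.
int repaintCount = 0;
void repaint() {
  requireRole(ThreadRole::GuiEvent);
  ++repaintCount;
}

// Runs repaint() on a fresh thread that has been given `role`, and reports
// whether the policy allowed the call.
bool tryRepaintAs(ThreadRole role) {
  bool allowed = false;
  std::thread t([&allowed, role] {
    currentRole = role;
    try {
      repaint();
      allowed = true;
    } catch (const std::logic_error&) {
      allowed = false;
    }
  });
  t.join();
  return allowed;
}
```

Calling tryRepaintAs(ThreadRole::GuiEvent) succeeds, while
tryRepaintAs(ThreadRole::Worker) is rejected.  A static analysis would aim to
report the second call at compile time rather than at run time.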

Obviously, these two analyses are related. Policy-based concurrency
sometimes serves as a proxy for lock-based concurrency. But policy-based
concurrency is only one of the use cases for TRA. When programmers know
which thread roles may execute a particular function, this provides
some guidance regarding what they can or should invoke from that
function.  For example, an astronomy application we analyzed
compartmentalized things like computation, printing, disk and network I/O,
etc. onto individual threads.  This separation was partly for GUI
responsiveness, performance and maintenance, all of which have limited
relation to locking. The thread roles were purely a useful abstraction for
enforcing developer policy.

We feel that TRA and lock-based analysis are complementary, but
different, systems that can benefit from each other.  There's nothing
inherently disruptive about TRA to the overall architecture of clang
(it's mostly making use of existing mechanisms); it can be implemented
in an incremental fashion for easier community involvement.
Ultimately, we believe there are considerable benefits to integrating
the two systems together, but there does not appear to be enough similarity
to attempt to use the existing lock-based infrastructure as a jumping-off
point.

Dean Sutherland
(with much help from Aaron Ballman)
