[cfe-dev] [RFC] Adding Thread Role Analysis

Mon Jun 24 11:16:15 PDT 2013

Thank you for the long explanation.  However, based on what you've said, I
do not think that your approach warrants a completely separate analysis,
along with a completely separate set of annotations.

In particular, I believe that "lock" and "role" are essentially the same
thing.  In both cases, the analysis is marking functions and data
structures as being "protected" by a particular lock/role, and ensuring
that a thread must hold/have the lock/role before calling protected
functions, or accessing protected data.  Most of the analysis logic
(tracking lock/role sets, dealing with pointer aliasing, etc.) will thus be
exactly the same.  It makes no sense to duplicate both annotations and
logic.

In particular, I foresee a time when someone might wish to use both locks
*and* roles in the same program.  In that case, it would be much better to
have a single analysis that understood both.

That said, you did point out some useful missing features, which are:

> We support more complex boolean expressions over thread roles. The
> existing Clang annotations cannot express multiple combinations of
> such properties.  For example, ((A & B & !C) | (D & !E)).

As you mention, the existing analysis does not support this case.  However,
this functionality would not be hard to add -- it's simply a boolean
constraint on allowable lock sets.  If you feel it's valuable, then let's
do that!

> TRA (for code) plus a sufficient effects and/or alias analysis (for
> data) can combine to support discovery of data that *should* be locked
> (but currently is not).   .... Thus, multiple tricks to reduce the number
of
> annotations to be written (see the paper).  ...

I would classify this under the heading of "annotation inference".  It
would, indeed, be extremely useful to have a reliable way to infer the
appropriate thread annotations.  There is an existing body of literature on
thread inference for lock annotations, but it is currently completely
unimplemented in clang; that would be a useful contribution.

My general feeling is that role inference and lock inference will end up
being very similar algorithms.  As you point out, however, there is a
difference in scale -- there are at most a handful of roles in a program,
while there may be many locks.  Thus, I am willing to be convinced that
your inference algorithm for roles is faster and/or more complete than the
corresponding version for locks, which would be reason for a separate
implementation.

The thing to be careful of here is that we will need to separate inference
from checking, because there are three separate use cases:

* Case 1: all annotations in source files.  (current system)
* Case 2: inference tool will add annotations to source files.
* Case 3: inference tool will write annotations to sidecar files, to be
used by separate analysis tool.

On Fri, Jun 21, 2013 at 1:31 PM, Dean Sutherland <dsutherland at cert.org>wrote:

> On Jun 20, 2013, at 7:12 PM, Delesley Hutchins <delesley at google.com>
> wrote:
> >
> > As Chandler mentioned, clang already has an existing thread safety
> analysis,
> > which is based on type systems for static race detection, which I believe
> > you are familiar with:
> >
> > http://users.soe.ucsc.edu/~cormac/papers/pldi00.pdf
> > http://users.soe.ucsc.edu/~cormac/papers/toplas06-safe-locking.pdf
> >
> > The clang annotations can be found at:
> > http://clang.llvm.org/docs/LanguageExtensions.html
> >
> > In a nutshell, the existing analysis allows a data member to be
> annotated as
> > being guarded by a particular lock.  Similarly, a function can be
> annotated
> > as requiring a particular lock to be held.  The analysis then tracks the
> set
> > of locks that are known to be held at every program point, and issues
> > warnings when a data member is accessed or a function is called without
> > holding the appropriate lock.
>
> We are familiar with these techniques, which are fundamentally similar to
> the
> lock analysis developed by our Fluid-group colleague Aaron Greenhouse (see
> the
> PPoPP paper's bibliography for references).  That said, we feel that TRA
> has
> additional benefits over the existing functionality.  The key difference is
> this: the Clang annotations, along with the vast majority of other work on
> static analysis for thread safety, are focused on lock-based concurrency.
> Thread role analysis is focused on NON-lock concurrency, which is a rather
> different thing.  However, we realize that there can be overlap between
> the two
> techniques, for example with regards to enforcing reader-writer behavior.
>
> Thread role analysis doesn't concern itself with locks, which provides
> benefits over the various lock-based analyses which don't do anything
> when there's no locking involved.  The issues addressed are rather
> different.  And, it turns out, many widely-used frameworks impose
> thread usage policies on their clients.
>
> The easiest example to explain involves the typical threading model
> used by most modern GUIs (including at least AWT/Swing and SWT in the
> Java world, the X windowing system, and many others). Specifically:
>
> * There is at most one GUI event thread at a time. In some
> implementations the identity of this thread is fixed. In others, the
> identity of the GUI event thread may change from time to time.
>
> * The GUI does not use locks to protect its data structures. (Note:
> this is from the *client* perspective.  There may be interesting
> locking *inside* the GUI implementation, but this is typically hidden
> from clients.) This choice was originally made for performance
> reasons; it has persisted due to the clash between the need for
> lock-ordering (to avoid deadlock) and the fact that events flow
> bi-directionally between the GUI(and the user) on one side, and the
> program and system on the other. This makes it extremely challenging
> (and perhaps impossible) to create a single consistent lock-ordering.
>
> * The GUI must avoid race conditions, even though its data is
> unprotected by locks. The typical solution is thread confinement, to
> wit: *only* the GUI event thread may touch the GUI's data structures.
> That's a thread usage policy, but not one that's easily expressable
> using lock-based annotations.  The statements below are consequences
> of that policy.
>
> * Code executing on threads other than the GUI event thread must never
> invoke GUI methods (actually, most GUI methods; see below). Getting
> this wrong breaks the necessary thread confinement. Note that this is
> a transitive property which must be true for all code executing on
> non-GUI threads.
>
> * GUI callback methods are invoked by the GUI on the GUI event thread,
> even though it is client-written code.  Consequences include:
> ** Code invoked from GUI callbacks must transitively invoke only
> methods that can legitimately execute on the GUI event thread.
> ** Data structures that are part of the client's UI implementation
> inherit the thread-confinement policy from the GUI.
> ** GUI callback methods should never be invoked from any thread
> *other* than the GUI event thread.
> ** All of these sub-items must be transitively true for all code
> reachable from GUI callback methods.
>
> * Client code should avoid blocking the GUI event thread, because that
> hangs the UI. Implications of this include:
> ** Don't risk waiting on locks that may be held by another
> thread. File and network I/O are probably a bad idea. Etc.
> ** Don't kick off lengthy computations while executing on the event
> thread.
> ** All of these require the ability to "know" which code may be
> executed by the event thread.
> ** Once again, all of these sub-items must be transitively true for
> all code reachable from GUI callback methods.
>
> * There are a few GUI functions that may be legitimately invoked from
> any thread. This is typically a short list, at least when compared
> with the total set of GUI framework methods. Typical examples include
> adding and removing event listeners, requesting that some action be
> taken on the GUI event thread (Swing calls this method
> "invokeLater(...)", other GUIs have other names for it), etc.
>
> All of these properties are entirely policy-based.  The typical
> lock-based concurrency analysis systems lack the concepts needed to
> express or enforce them.  The concepts of thread roles and constraints
> regarding which roles may execute which code (or touch which specific
> data items, if data analysis is supported) enable simple expression
> and enforcement of policies like these.
>
> > The existing analysis tracks locks, not threads or roles, so
> > concepts like the "GUI thread" cannot be directly
> > expressed. However, it's not hard to mimic this functionality -- the
> > GUI thread acquires a lock named "GUI_lock" when it starts, and
> > releases it when its done, and all code and data that should be
> > GUI-only is annotated as requiring that lock.  (A lock in clang is
> > simply a named C++ object; it does not need to implement an actual
> > lock.)  Switching of roles can similarly be modeled as releasing one
> > lock and acquiring another.  So at first glance, it seems to me that
> > lock-based analysis is roughly similar to the analysis that you
> > propose, and may in fact be significantly stronger; the "related
> > work" section in your paper does not discuss the differences in much
> > detail.
>
> It may be possible to squeeze TRA into the current thread safety
> analysis by using "notional" locks in place of thread roles as you
> suggest.  In what follows, a lock name (e.g., "A") means that the lock
> is required; the negation of a lock name (e.g., "!A") means that it is
> excluded.  The existing Clang annotations support requiring a set of
> locks (e.g., (A & B & C) ) and also forbidding a set of locks
> (e.g.,(!B & !C)).
>
> However, when you say:
>
> > and all code and data that should be GUI-only is annotated as
> > requiring that lock.
>
> If this statement requires user-written annotations for all such
> methods, the required annotation effort will make adoption exceedingly
> difficult for most practicing programmers.  One of the benefits of TRA
> is the application of well-founded techniques to avoid the need for
> most user-written annotations.  Perhaps some of these techniques could
> be brought to bear on the existing thread safety analysis.
>
>
> TRA offers some additional capabilities:
>
>  * We support more complex boolean expressions over thread roles. The
> existing Clang annotations cannot express multiple combinations of
> such properties.  For example, ((A & B & !C) | (D & !E)).  Before you
> ask, although such combinations would be nonsensical for locks, they
> really do show up for thread roles in production code.
>
> * TRA allows statements that certain thread roles are mutually
> exclusive (a GUI thread can't be a Compute thread, and vice
> versa). The Clang annotations lack support for this property.
>
> * TRA (for code) plus a sufficient effects and/or alias analysis (for
> data) can combine to support discovery of data that *should* be locked
> (but currently is not), as well as demonstrating that notionally
> thread-confined data is, in fact, thread confined.  If your existing
> effects or alias analysis is strong enough, that's a really exciting
> capability.  By comparison, the existing thread safety analysis
> *starts* when you decide that you need to protect some data with a
> lock.
>
> * TRA is strongly focused on reducing the user effort required to get
> results.  Thus, multiple tricks to reduce the number of annotations to
> be written (see the paper).  For example, our inference-based cross-TU
> analysis avoids the need to annotate each method in a chain of calls
> with the equivalent of a shared_locks_required() annotation. See also
> the discussion in the paper of analysis of partially-annotated code,
> "scoped promises" (NOT related to your scoped_lockable annotation),
> and so on.
>
> Our experience with extensive field trials showed that pragmatic
> adoptability requires all of:
> * Sound results. The existing thread safety analysis is certainly
> capably of providing this.
> * A very smooth incremental adoption path. I believe that the existing
> thread safety analysis provides this, for the properties it
> addresses. At a minimum, you can work through an existing code-base
> one lock at a time.
>
> * Very low programmer effort required to produce useful results.  My
> past experience with lock-based analyses suggests that this is mixed.
> The purely lock-based part is mostly there.  O-O effects and/or alias
> analysis, however, mostly fails this test.  Particular challenges
> arise due to the occasional need to prove properties like exclusive
> ownership of heap data (for some cases), or more-precise effects than
> "touches the heap" (for other cases).
>
> An important issue to be aware of is that the portion of a program
> that needs to know about any particular lock (and the data item
> protected by it) is typically a very small fraction of the total
> program -- perhaps only a handful of methods.  Writing an annotation
> for each method in that handful is rarely a substantial
> burden. However, interesting programs may contain large numbers of
> locks.
>
> Thread roles have very different properties. First, most programs --
> even extremely large programs -- tend to involve only a handful of
> interesting thread roles, regardless of the number of actual threads
> at runtime.  The multi-million line programs we analyzed had fewer
> than a dozen thread roles, but had thousands of threads.  During our
> field testing, the largest number of distinct thread roles we
> encountered in any one program was under 30.
>
> Secondly, interesting thread roles tend to involve very large swathes
> of code.  For example, the GUI model described above essentially
> partitions the entire program into two bins: GUI code and not-GUI
> code. It should be obvious that although:
>
> > and all code and data that should be GUI-only is annotated as
> > requiring that lock
>
> is exactly what we want conceptually, asking programmers to annotate
> every method in the entire program is a guaranteed non-starter due to
> the amount of effort required.  Hence our focus on extremely
> light-weight annotations and extremely light-weight analysis (our goal
> was less than double the compilation time of whatever we were
> analyzing).  I believe that this tendency to be relevant to very large
> fractions of an application is very different from the typical
> use-case for the thread safety analysis, at least in terms of the
> annotations to be written (or inferred).
>
> It may, in fact, be possible to combine the capabilities of TRA and
> the current thread safety checking to produce something very
> interesting.  Let's talk about it!  But we believe that TRA offers
> sufficient benefits to warrant its inclusion as a feature in addition
> to the existing lock-based analysis.  They're similiar in that they
> both deal with thread safety, but solve the problem in different ways.
>
> > The existing clang annotations are used extensively on production
> > code; we have many hundreds of thousands of lines that are currently
> > annotated and checked.  It is not a research project.
>
> While the previous Java system described in the paper was a research
> project, it has progressed far past that point.  It (along with
> the other static analyses produced by members of the Fluid research
> group) was used sucessfully on millions of lines of production Java
> code (warts and all). The Fluid group's analyses (including TRA) were
> subsequently tech-transferred out into industry.  In any case, the
> Java side is definitely not research at this point.
>
> We're proposing to bring the same concepts over to the C-based family
> of languages because the problems TRA solves are mostly
> language-agnostic.  The extent of the research involved at this point
> is in finding the most obvious ways to express the capabilities; the
> underlying concept has already been proven useful and effective.
>
>

-- 
DeLesley Hutchins | Software Engineer | delesley at google.com | 505-206-0315
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130624/6966b2eb/attachment.html>