[cfe-dev] [RFC] Adding Thread Role Analysis

Fri Jun 28 14:02:08 PDT 2013

I understand that locks and roles are different in some sort of high-level
cognitive sense.  However, I believe that the basic annotations and
analysis are exactly the same; the difference lies only in how they are
used.  In particular, you need to:

(a) Mark data as being protected by a lock/role.
(b) Mark functions as being protected by a lock/role.
(c) Identify when a thread acquires and releases a lock/role.
(c) Warn whenever a data/function is accessed/called without holding the
proper lock/role.

Roles are globally defined, acquired on thread creation, and seldom or
never released, while locks may be stored in objects and are much more
fine-grained.  However, this indicates to me that roles are simply a subset
of locks, not a completely different thing.

I very strongly believe that we should have one set of annotations, and one
analysis for doing these checks.  I do not want to introduce a second set
of annotations that does exactly the same thing, or a second analysis that
is almost like the first, but different in some subtle way.  That will only
be confusing to people who are actually trying to use this stuff for
practical programming.

> Thread role analysis operates on policy models such as "Only the GUI event
thread may invoke [long-list-of-methods]."

Lock analysis establishes the same sort of policy.  You must hold 'lock' to
call [long list of methods].

> Section 5.2 of the PPoPP paper gives the full explanation.  In brief, it
runs
> off the idea that annotations are propagated from one function to another
along
> the call graph.  Eg)

Lock annotations are propagated in exactly the same way.  The difference is
that the current system does not infer missing annotations, so it will
issue a warning instead:

> void h() LOCKS_REQUIRED(mu_) {}
> void g() { h(); }  // warning -- missing LOCKS_REQUIRED(mu_) annotation

There are two reasons why I have not attempted to do inference yet.  First,
inference requires whole-program analysis, which is incompatible with
compiling by translation unit.  And second, with locks, it is impossible to
determine whether the programmer is supposed to acquire mu_ within g(), or
whether g() should be marked with LOCKS_REQUIRED.  With roles, you know
that it's probably missing the annotation.

I propose a path forward to resolve these issues.

(1) Reuse the current annotations as much as possible.  If there are things
that just can't be done with the existing annotations (you mentioned
complex boolean conditions on roles), then let's come up with a minimal
extension to support what you want to do.  I suspect that many such
extensions would actually be valuable for locks as well.  Most of the
existing annotations, GUARDED_BY, LOCKS_REQUIRED, etc. can be directly
mapped onto your system, and can be re-used as-is.

(2) Implement role inference as a separate, whole-program analysis pass,
using sidecar files etc.  I am willing to believe that role inference may
be different from lock inference, although I'm not entirely convinced that
the two cannot be merged.  But it seems to me that inference is what really
distinguishes your work from the existing system, so let's put it in a
separate tool.

(3) Reuse the current analysis as much as possible.  IMHO, it is always
better to logically separate inference from checking; checking assumes a
fully annotated program, while inference will infer missing annotations.
 Your inference pass should thus supply the missing annotations, and then
invoke the existing analysis to issue the warnings.  If the existing
analysis needs to be extended, refactored, or fixed in some way in order to
make this work, then let's do that; our goal is to keep the two analyses in
sync.

> there does not appear to be enough similarity
> to attempt to use the existing lock-based infrastructure as a jumping-off
> point.

I completely disagree.  If one can be mapped into the other, as you
yourself have said, then how are they not similar?

  -DeLesley

On Fri, Jun 28, 2013 at 12:11 PM, Dean Sutherland <dsutherland at cert.org>wrote:

>  TRA and the current thread safety checking seem to us to be different
> bike sheds, both of which are interesting.  For more, see below.
>
>  On Jun 24, 2013, at 2:19 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
>
>  On Fri, Jun 21, 2013 at 1:31 PM, Dean Sutherland <dsutherland at cert.org>wrote:
>
>> It may, in fact, be possible to combine the capabilities of TRA and
>>  the current thread safety checking to produce something very
>> interesting.  Let's talk about it!  But we believe that TRA offers
>> sufficient benefits to warrant its inclusion as a feature in addition
>> to the existing lock-based analysis.  They're similiar in that they
>> both deal with thread safety, but solve the problem in different ways.
>>
>
>  I will freely admit that I've not read all of this thread (it is *very*
> long, where possible brevity would be helpful to get a wider audience), I
> wanted to expand on the first email I sent to Aaron in response to this
> comment.
>
>
>  I think it is fundamentally important to approach contributing this type
> of work to clang as *incremental* improvements to the existing
> functionality. It would be very disruptive to the project to have a second
> analysis system surrounding thread safety. I'm not saying that the existing
> functionality is perfect or must be exactly preserved, I'm just saying that
> you should propose new functionality via an incremental path from where we
> are today.
>
> [SNIP]
>
>   I think these issues will (to a certain extent) be just as important as
> the concerns of basing this on sound academic research, and having a good
> theoretical model behind the analysis and diagnostics produced. As an
> example, I think one thing that is actively hurting the community in
> understanding your proposal is trying to first get an existing community to
> shift terminology to that of specific research papers, and then describing
> what you want in those terms. Maybe it would be possible to instead use the
> existing terminology, or where it isn't good terminology correct the
> terminology of the existing system before trying to describe new things in
> terms of it?
>
>
>  The single biggest difference between the existing lock analysis and
> thread role analysis is that we're addressing different (but
> partly-overlapping) issues.
>
> Lock analysis starts with some data that must be protected from
> multi-threaded access.  You then annotate the data to indicate the
> specific lock that protects it, annotate methods to indicate how they
> interact with the locks.  Analysis involves tracking data references
> (which is the really hard part!) and then running the lock-set algorithm
> to make sure you never have an unprotected access. This is great stuff!
> This ensures that a program has correct and consistent use of locks and
> protection of data.
>
> Thread role analysis operates on policy models such as "Only the GUI
> event thread may invoke [long-list-of-methods]." This is a common
> idiom in many widely-used frameworks, for example AppKit's policy
> of always operating from the primary thread.  Many of these policies exist
> because fine-grained locking was too difficult, too expensive, or even
> impossible—perhaps due to deadlock potential—for the problem at hand.
> Other frameworks use a "no concurrency" model, outsourcing concurrency from
> the application to the framework. Although this serves to provide thread
> confinement of data, it also places restrictions on what *functions* may be
> invoked, independent of the *data* those objects may touch. For example:
> such
> code may never start a thread running, grab a thread from a pool, etc.  The
> Actor design pattern provides an example of the "no concurrency in the
> client" approach.
>
> Obviously, these two analyses are related. Policy-based concurrency
> sometimes serves as a proxy for lock-based concurrency. But policy-based
> concurrency is a subset of the use-case for TRA. When programmers know
> which thread roles may execute a particular function, this provides
> some guidance regarding what they can or should invoke from that
> function.  For example, an astronomy application we analyzed
> compartmentalized things like computation, printing, disk and network i/o,
> etc. onto individual threads.  This separation was partly for GUI
> responsiveness, performance and maintenance, all of which has limited
> relation to locking. The thread roles were purely a useful abstraction for
> enforcing developer policy.
>
> We feel that TRA and lock-based analysis are complementary, but
> different, systems that can benefit from each other.  There's nothing
> inherently disruptive about TRA to the overall architecture of clang
> (it's mostly making use of existing mechanisms); it can be implemented
> in an incremental fashion for easier community involvement.
> Ultimately, we believe there are considerable benefits to integrating
> the two systems together, but there does not appear to be enough similarity
> to attempt to use the existing lock-based infrastructure as a jumping-off
> point.
>
>  Dean Sutherland
> (with much help from Aaron Ballman)
>
>

-- 
DeLesley Hutchins | Software Engineer | delesley at google.com | 505-206-0315
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130628/8dcce909/attachment.html>