[cfe-dev] Next step

Thu May 15 07:16:07 PDT 2014

On Wed, May 14, 2014 at 11:50 AM, Delesley Hutchins <delesley at google.com> wrote:
> [Ongoing discussion on thread safety analysis, moved to list in case
> anyone cares.  :-)]
>
>> * change getMutexIDs so that it creates an LExpr instead of a set of
>> SExprs. Instead of returning a MutexIDList, it should return the LExpr
>> created.
>
> Yes.
>
>> * addLock should be changed to take an LExpr instead of a MutexIDList
>> (same for removeLock)
>
> Yes.
>
>> * FactSet needs to be updated to handle LExprs for findLock, addLock
>> and removeLock
>
> Yes, but there's more to it.  (See below.)
>
>> * FactManager needs to create a FactID from an LExpr for newLock
>
> Not quite.  (See below.)
>
>> * FactEntry needs to be updated to hold (a union including SExpr and?) an LExpr
>
> Not quite.  (See below.)
>
> A few notes on implementation:  MutexIDList is just a short list of
> SExprs, and it's used when extracting SExprs from attributes, because
> an attribute (e.g. exclusive_locks_required) can hold a list of
> SExprs, rather than a single one.

Good, that matches my understanding.

> FactSet holds a set of SExprs that are known to be held.  It's not
> just a list for two reasons.  (1) A FactSet keeps track of some
> additional information about each SExpr, e.g. where the SExprs were
> acquired, whether they are exclusive or shared, and the weirdness
> associated with scoped_lockable.  (2) FactSets are optimized for small
> size and merging.

This also matches my understanding.

>
> You will need to change FactSet so that it holds a single LExpr.
> Instead of the set {A, B, C},  you will store the LExpr  (A & B & C).
> What this means is that the operations to add locks, remove locks, and
> merge locksets will now apply to LExprs.

Okay, so the FactSet basically becomes our computed environment for
capabilities, and is simply a single LExpr that we constantly update?

> For example:
>
>    Current context:  (A & B & !C).    Call  C.lock().   New context:
> (A & B & C).
>    Current context:  (A & B & C).     Call  B.unlock().  New context:
> (A & C & !B).
>    Context1:  (A & B & C).    Context2:  (A & B & D).   Merge on join
> point:  (A & B & (C | D))
>
> [Joint points].  At every join point in the CFG, the analysis will
> have to merge the two LExprs.

So this would mean modifying ThreadSafetyAnalyzer::intersectAndWarn?
Or are there other places requiring modification?

> Merging is conceptually simple:  if we
> can arrive at a given basic block from either block 1, with context
> E1, or from block 2, with context E2, then the context for that block
> is (E1 | E2).   However, you will get an exponential blowup in
> expression size unless you *simplify* (E1 | E2) afterward, as shown
> above.

Do you have a particular algorithm in mind for performing the simplification?

>
> [Adding and removing locks.]  In order to add or remove a mutex A from
> an LExpr E, you must first remove all prior occurrences of A from E.
> (We're chan do not know ging what we know about A.)  Unfortunately,
> that's not just a simple matter of traversing E, and deleting the
> references.  You need to derive a new expression E' which does not
> mention A, where E is satisfiable only if E' is satisfiable.  More
> formally, you must factor out A, by refactoring E into the form  (A &
> E1) | (!A & E2), where E1 and E2 do not refer to A;  E' is then (E1 |
> E2).  There are algorithms to do this for BDDs; the operation is known
> as boolean existential quantification. See:
> http://gauss.ececs.uc.edu/Courses/c626/lectures/BDD/bdd-desc.pdf

Thank you for the link!

>
> Unfortunately, I do not know how to do this for the simple LExprs that
> you have constructed.  One option is to require the LExpr to be in the
> format  (E1 & E2 ... En & A)  to start with (so no refactoring
> necessary), and simply fail otherwise.  That will work for locks,
> because it matches the way FactSet is implemented now, and it will
> thus allow you to make forward progress.  But it largely defeats the
> point of implementing LExprs, since analysis will fail if it
> encounters anything more complicated than what we already have.  The
> other option is to ditch LExprs, and implement BDDs. Talk to Dean; he
> knows more about this than I do.

It sounds a lot more like we really want a BDD than an LExpr. I will
talk to Dean, but do you have any feelings on whether a BDD is a
better approach than the LExpr we have? If you think BDDs are the
correct approach long-term, do you know of any BDD libraries that
would be compatible with our licensing requirements, or whether there
would be issues with bringing in a 3rd party library for this?

> My task list, in order of priority, would be:
>
> (1) Add the ability to merge (logical or) and simplify LExprs.
> (convert to conjuctive normal form?)
> (2) Add the existential quantification operation to LExprs  (or
> possibly implement BDDs?).
> (3) Augment LExprs so that they keep track of exclusive vs. shared,
> where locks were acquired, and scoped_lockable.

I think the highest priority would be to determine whether we are
going to use LExrs, BDDs, or a combination of the two. The work is
likely to be roughly the same in terms of hooking things up, but the
tools are different.

> That's quite a lot of work already, and it will take you some time.
> Once it's done, *then* we can do a single patch to swap FactSet over
> to the new LExpr stuff.  The patch sequence is the same as my strategy
> for the TIL/SExpr stuff; build the new functionality in parallel with
> the old, make sure the new functionality works, and then flip the
> switch from old to new.  Notice that I haven't flipped the switch for
> SExpr yet.  :-)
>
> We can set up a video chat if you want to talk about this further.

I think a video chat would be a great idea. We can discuss offlist.

Thanks!

~Aaron