[cfe-dev] Next step

Delesley Hutchins delesley at google.com
Thu May 15 08:01:27 PDT 2014


I think BDDs are definitely the best long-term solution, and possible
the best short-term solution.  The LExpr stuff is useful for rapid
prototyping, and it's also fast and efficient to build an LExpr from
the requires(...) attribute for the purpose of checking.  However, you
can't write efficient algorithms for LExpr.  BDDs are basically the
algorithm you want here.  Dunno about 3rd-party libraries; my concern
would be finding something very small, so that you don't wind up
increasing the clang memory footprint with a feature that isn't used
all that often.  Rolling your own BDD library is definitely an option;
we don't need that much functionality, so it might be best to just
implement what you need and nothing more.

  -DeLesley


On Thu, May 15, 2014 at 7:16 AM, Aaron Ballman <aaron at aaronballman.com> wrote:
> On Wed, May 14, 2014 at 11:50 AM, Delesley Hutchins <delesley at google.com> wrote:
>> [Ongoing discussion on thread safety analysis, moved to list in case
>> anyone cares.  :-)]
>>
>>> * change getMutexIDs so that it creates an LExpr instead of a set of
>>> SExprs. Instead of returning a MutexIDList, it should return the LExpr
>>> created.
>>
>> Yes.
>>
>>> * addLock should be changed to take an LExpr instead of a MutexIDList
>>> (same for removeLock)
>>
>> Yes.
>>
>>> * FactSet needs to be updated to handle LExprs for findLock, addLock
>>> and removeLock
>>
>> Yes, but there's more to it.  (See below.)
>>
>>> * FactManager needs to create a FactID from an LExpr for newLock
>>
>> Not quite.  (See below.)
>>
>>> * FactEntry needs to be updated to hold (a union including SExpr and?) an LExpr
>>
>> Not quite.  (See below.)
>>
>> A few notes on implementation:  MutexIDList is just a short list of
>> SExprs, and it's used when extracting SExprs from attributes, because
>> an attribute (e.g. exclusive_locks_required) can hold a list of
>> SExprs, rather than a single one.
>
> Good, that matches my understanding.
>
>> FactSet holds a set of SExprs that are known to be held.  It's not
>> just a list for two reasons.  (1) A FactSet keeps track of some
>> additional information about each SExpr, e.g. where the SExprs were
>> acquired, whether they are exclusive or shared, and the weirdness
>> associated with scoped_lockable.  (2) FactSets are optimized for small
>> size and merging.
>
> This also matches my understanding.
>
>>
>> You will need to change FactSet so that it holds a single LExpr.
>> Instead of the set {A, B, C},  you will store the LExpr  (A & B & C).
>> What this means is that the operations to add locks, remove locks, and
>> merge locksets will now apply to LExprs.
>
> Okay, so the FactSet basically becomes our computed environment for
> capabilities, and is simply a single LExpr that we constantly update?
>
>> For example:
>>
>>    Current context:  (A & B & !C).    Call  C.lock().   New context:
>> (A & B & C).
>>    Current context:  (A & B & C).     Call  B.unlock().  New context:
>> (A & C & !B).
>>    Context1:  (A & B & C).    Context2:  (A & B & D).   Merge on join
>> point:  (A & B & (C | D))
>>
>> [Joint points].  At every join point in the CFG, the analysis will
>> have to merge the two LExprs.
>
> So this would mean modifying ThreadSafetyAnalyzer::intersectAndWarn?
> Or are there other places requiring modification?
>
>> Merging is conceptually simple:  if we
>> can arrive at a given basic block from either block 1, with context
>> E1, or from block 2, with context E2, then the context for that block
>> is (E1 | E2).   However, you will get an exponential blowup in
>> expression size unless you *simplify* (E1 | E2) afterward, as shown
>> above.
>
> Do you have a particular algorithm in mind for performing the simplification?
>
>>
>> [Adding and removing locks.]  In order to add or remove a mutex A from
>> an LExpr E, you must first remove all prior occurrences of A from E.
>> (We're chan do not know ging what we know about A.)  Unfortunately,
>> that's not just a simple matter of traversing E, and deleting the
>> references.  You need to derive a new expression E' which does not
>> mention A, where E is satisfiable only if E' is satisfiable.  More
>> formally, you must factor out A, by refactoring E into the form  (A &
>> E1) | (!A & E2), where E1 and E2 do not refer to A;  E' is then (E1 |
>> E2).  There are algorithms to do this for BDDs; the operation is known
>> as boolean existential quantification. See:
>> http://gauss.ececs.uc.edu/Courses/c626/lectures/BDD/bdd-desc.pdf
>
> Thank you for the link!
>
>>
>> Unfortunately, I do not know how to do this for the simple LExprs that
>> you have constructed.  One option is to require the LExpr to be in the
>> format  (E1 & E2 ... En & A)  to start with (so no refactoring
>> necessary), and simply fail otherwise.  That will work for locks,
>> because it matches the way FactSet is implemented now, and it will
>> thus allow you to make forward progress.  But it largely defeats the
>> point of implementing LExprs, since analysis will fail if it
>> encounters anything more complicated than what we already have.  The
>> other option is to ditch LExprs, and implement BDDs. Talk to Dean; he
>> knows more about this than I do.
>
> It sounds a lot more like we really want a BDD than an LExpr. I will
> talk to Dean, but do you have any feelings on whether a BDD is a
> better approach than the LExpr we have? If you think BDDs are the
> correct approach long-term, do you know of any BDD libraries that
> would be compatible with our licensing requirements, or whether there
> would be issues with bringing in a 3rd party library for this?
>
>> My task list, in order of priority, would be:
>>
>> (1) Add the ability to merge (logical or) and simplify LExprs.
>> (convert to conjuctive normal form?)
>> (2) Add the existential quantification operation to LExprs  (or
>> possibly implement BDDs?).
>> (3) Augment LExprs so that they keep track of exclusive vs. shared,
>> where locks were acquired, and scoped_lockable.
>
> I think the highest priority would be to determine whether we are
> going to use LExrs, BDDs, or a combination of the two. The work is
> likely to be roughly the same in terms of hooking things up, but the
> tools are different.
>
>> That's quite a lot of work already, and it will take you some time.
>> Once it's done, *then* we can do a single patch to swap FactSet over
>> to the new LExpr stuff.  The patch sequence is the same as my strategy
>> for the TIL/SExpr stuff; build the new functionality in parallel with
>> the old, make sure the new functionality works, and then flip the
>> switch from old to new.  Notice that I haven't flipped the switch for
>> SExpr yet.  :-)
>>
>> We can set up a video chat if you want to talk about this further.
>
> I think a video chat would be a great idea. We can discuss offlist.
>
> Thanks!
>
> ~Aaron



-- 
DeLesley Hutchins | Software Engineer | delesley at google.com | 505-206-0315



More information about the cfe-dev mailing list