[cfe-dev] Next step
Delesley Hutchins
delesley at google.com
Mon May 19 07:44:00 PDT 2014
BDDs are what we really want, but we didn't want to link a giant
3rd-party library into clang, so we've been talking about small
light-weight alternatives. If buddy is small, then it's definitely
worth looking into. Thanks!
-DeLesley
On Sat, May 17, 2014 at 4:31 AM, Nuno Lopes <nunoplopes at sapo.pt> wrote:
> FWIW, we have been using buddy (http://sourceforge.net/projects/buddy/), an
> implementation of BDDs.
> This library is pretty small and performs well. It has a BSD-like license.
> It compiles at least with GCC, clang, and MSVC.
> The project was a bit dead for a long time, but it restarted recently. We
> have been contributing our local patches as well.
>
> (don't know if using BDDs isn't overkill for your needs, but I didn't follow
> the thread closely).
>
> Nuno
>
>
>
> ----- Original Message -----
>>
>> I think BDDs are definitely the best long-term solution, and possible
>> the best short-term solution. The LExpr stuff is useful for rapid
>> prototyping, and it's also fast and efficient to build an LExpr from
>> the requires(...) attribute for the purpose of checking. However, you
>> can't write efficient algorithms for LExpr. BDDs are basically the
>> algorithm you want here. Dunno about 3rd-party libraries; my concern
>> would be finding something very small, so that you don't wind up
>> increasing the clang memory footprint with a feature that isn't used
>> all that often. Rolling your own BDD library is definitely an option;
>> we don't need that much functionality, so it might be best to just
>> implement what you need and nothing more.
>>
>> -DeLesley
>>
>>
>> On Thu, May 15, 2014 at 7:16 AM, Aaron Ballman <aaron at aaronballman.com>
>> wrote:
>>>
>>> On Wed, May 14, 2014 at 11:50 AM, Delesley Hutchins <delesley at google.com>
>>> wrote:
>>>>
>>>> [Ongoing discussion on thread safety analysis, moved to list in case
>>>> anyone cares. :-)]
>>>>
>>>>> * change getMutexIDs so that it creates an LExpr instead of a set of
>>>>> SExprs. Instead of returning a MutexIDList, it should return the LExpr
>>>>> created.
>>>>
>>>>
>>>> Yes.
>>>>
>>>>> * addLock should be changed to take an LExpr instead of a MutexIDList
>>>>> (same for removeLock)
>>>>
>>>>
>>>> Yes.
>>>>
>>>>> * FactSet needs to be updated to handle LExprs for findLock, addLock
>>>>> and removeLock
>>>>
>>>>
>>>> Yes, but there's more to it. (See below.)
>>>>
>>>>> * FactManager needs to create a FactID from an LExpr for newLock
>>>>
>>>>
>>>> Not quite. (See below.)
>>>>
>>>>> * FactEntry needs to be updated to hold (a union including SExpr and?)
>>>>> an LExpr
>>>>
>>>>
>>>> Not quite. (See below.)
>>>>
>>>> A few notes on implementation: MutexIDList is just a short list of
>>>> SExprs, and it's used when extracting SExprs from attributes, because
>>>> an attribute (e.g. exclusive_locks_required) can hold a list of
>>>> SExprs, rather than a single one.
>>>
>>>
>>> Good, that matches my understanding.
>>>
>>>> FactSet holds a set of SExprs that are known to be held. It's not
>>>> just a list for two reasons. (1) A FactSet keeps track of some
>>>> additional information about each SExpr, e.g. where the SExprs were
>>>> acquired, whether they are exclusive or shared, and the weirdness
>>>> associated with scoped_lockable. (2) FactSets are optimized for small
>>>> size and merging.
>>>
>>>
>>> This also matches my understanding.
>>>
>>>>
>>>> You will need to change FactSet so that it holds a single LExpr.
>>>> Instead of the set {A, B, C}, you will store the LExpr (A & B & C).
>>>> What this means is that the operations to add locks, remove locks, and
>>>> merge locksets will now apply to LExprs.
>>>
>>>
>>> Okay, so the FactSet basically becomes our computed environment for
>>> capabilities, and is simply a single LExpr that we constantly update?
>>>
>>>> For example:
>>>>
>>>> Current context: (A & B & !C). Call C.lock(). New context:
>>>> (A & B & C).
>>>> Current context: (A & B & C). Call B.unlock(). New context:
>>>> (A & C & !B).
>>>> Context1: (A & B & C). Context2: (A & B & D). Merge on join
>>>> point: (A & B & (C | D))
>>>>
>>>> [Joint points]. At every join point in the CFG, the analysis will
>>>> have to merge the two LExprs.
>>>
>>>
>>> So this would mean modifying ThreadSafetyAnalyzer::intersectAndWarn?
>>> Or are there other places requiring modification?
>>>
>>>> Merging is conceptually simple: if we
>>>> can arrive at a given basic block from either block 1, with context
>>>> E1, or from block 2, with context E2, then the context for that block
>>>> is (E1 | E2). However, you will get an exponential blowup in
>>>> expression size unless you *simplify* (E1 | E2) afterward, as shown
>>>> above.
>>>
>>>
>>> Do you have a particular algorithm in mind for performing the
>>> simplification?
>>>
>>>>
>>>> [Adding and removing locks.] In order to add or remove a mutex A from
>>>> an LExpr E, you must first remove all prior occurrences of A from E.
>>>> (We're chan do not know ging what we know about A.) Unfortunately,
>>>> that's not just a simple matter of traversing E, and deleting the
>>>> references. You need to derive a new expression E' which does not
>>>> mention A, where E is satisfiable only if E' is satisfiable. More
>>>> formally, you must factor out A, by refactoring E into the form (A &
>>>> E1) | (!A & E2), where E1 and E2 do not refer to A; E' is then (E1 |
>>>> E2). There are algorithms to do this for BDDs; the operation is known
>>>> as boolean existential quantification. See:
>>>> http://gauss.ececs.uc.edu/Courses/c626/lectures/BDD/bdd-desc.pdf
>>>
>>>
>>> Thank you for the link!
>>>
>>>>
>>>> Unfortunately, I do not know how to do this for the simple LExprs that
>>>> you have constructed. One option is to require the LExpr to be in the
>>>> format (E1 & E2 ... En & A) to start with (so no refactoring
>>>> necessary), and simply fail otherwise. That will work for locks,
>>>> because it matches the way FactSet is implemented now, and it will
>>>> thus allow you to make forward progress. But it largely defeats the
>>>> point of implementing LExprs, since analysis will fail if it
>>>> encounters anything more complicated than what we already have. The
>>>> other option is to ditch LExprs, and implement BDDs. Talk to Dean; he
>>>> knows more about this than I do.
>>>
>>>
>>> It sounds a lot more like we really want a BDD than an LExpr. I will
>>> talk to Dean, but do you have any feelings on whether a BDD is a
>>> better approach than the LExpr we have? If you think BDDs are the
>>> correct approach long-term, do you know of any BDD libraries that
>>> would be compatible with our licensing requirements, or whether there
>>> would be issues with bringing in a 3rd party library for this?
>>>
>>>> My task list, in order of priority, would be:
>>>>
>>>> (1) Add the ability to merge (logical or) and simplify LExprs.
>>>> (convert to conjuctive normal form?)
>>>> (2) Add the existential quantification operation to LExprs (or
>>>> possibly implement BDDs?).
>>>> (3) Augment LExprs so that they keep track of exclusive vs. shared,
>>>> where locks were acquired, and scoped_lockable.
>>>
>>>
>>> I think the highest priority would be to determine whether we are
>>> going to use LExrs, BDDs, or a combination of the two. The work is
>>> likely to be roughly the same in terms of hooking things up, but the
>>> tools are different.
>>>
>>>> That's quite a lot of work already, and it will take you some time.
>>>> Once it's done, *then* we can do a single patch to swap FactSet over
>>>> to the new LExpr stuff. The patch sequence is the same as my strategy
>>>> for the TIL/SExpr stuff; build the new functionality in parallel with
>>>> the old, make sure the new functionality works, and then flip the
>>>> switch from old to new. Notice that I haven't flipped the switch for
>>>> SExpr yet. :-)
>>>>
>>>> We can set up a video chat if you want to talk about this further.
>>>
>>>
>>> I think a video chat would be a great idea. We can discuss offlist.
>>>
>>> Thanks!
>>>
>>> ~Aaron
>
>
--
DeLesley Hutchins | Software Engineer | delesley at google.com | 505-206-0315
More information about the cfe-dev
mailing list