[cfe-dev] Suggested starter bug on clang analyzer codebase

Thu Apr 20 07:51:07 PDT 2017

Dear Dr. Artem,

I have the following queries.

   - Will I ever require SchrodingerLocked? (As you mentioned, a mutex
   becomes a Schrodinger only when pthread_mutex_destroy() is called. Now,
   calling pthread_mutex_destroy() after pthread_mutex_lock() is anyway wrong.
   So, a warning should be raised in this case.)
   - SVal corresponding to lck_mtx_destroy() is Unknown or Undefined.
   Hence, I am not able to obtain a symbol for the return value of
   lck_mtx_destroy() which is leading to incorrect output on the testcase
   pthreadlock.c.

I'm also attaching the modified code.
I want you to check certain parts of the code. (I have added capitalized
comments just before the parts which I want you to check.)

Thank you.

Regards,
Malhar
ᐧ

On Wed, Apr 19, 2017 at 12:11 PM, Artem Dergachev <noqnoqneo at gmail.com>
wrote:

> For completeness, i'd point out that there's a simpler alternative
> approach to this: force the analyzer to split the state immediately on
> pthread_mutex_destroy(): construct a state where the mutex is destroyed and
> symbol is known-to-be-zero, construct a state where the mutex remains the
> same and symbol is known-to-be-non-zero, and addTransition to both of them.
> This forces the analyzer to investigate two concrete states in parallel,
> rather than one "Schrödinger" state that would still be split when the
> return value is checked.
>
> However, that approach pays a huge price of doubling analysis time - the
> time we would spend in the "Schrödinger" state in the original approach
> would be doubled here (and if the return value is truly unchecked, that may
> go on until the end of the analysis, because even after the symbol dies the
> mutex state is still different, so the paths wouldn't be merged). This is
> why we try to avoid state splits unless it is definitely necessary.
>
>
> On 4/19/17 9:22 AM, Artem Dergachev wrote:
>
>> Yep!
>>
>> You are right that simply unlocking the failed-to-be-destroyed mutex in
>> checkDeadSymbols is not enough. The reason for that is, the moment when the
>> symbol dies is quite unpredictable, and we don't know how many things have
>> happened to the mutex between the failed destroy and the symbol death - it
>> might have been locked and unlocked many times in between.
>>
>> I'd propose that in order to maintain the correct mutex state you'd need
>> to check the destruction-return-value symbol in every operation as well.
>> For instance, when somebody tries to lock a mutex, see if it was destroyed
>> previously. If it was destroyed, look at the symbol:
>> * If the destruction was checked and failed on that execution path (as
>> you can find out by looking at the symbol's constraints in the program
>> state), then it's ok, remove that symbol from the program state and mark
>> the mutex as locked.
>> * If the destruction cannot be proven to have failed on this path, then
>> either it is known to have succeeded, or destruction wasn't checked for
>> failure properly on this path. It means we have a use-after-destroy, and we
>> report the warning accordingly.
>>
>> At the same time, checkDeadSymbols is still necessary: if no mutex state
>> changes happen after destroy before symbol death, we need to collect the
>> information from the dying symbol. Because the symbol is dying, this
>> information will no longer ever be made more accurate (that is, constraints
>> wouldn't ever become more strict anymore), so it's as accurate as possible.
>> So we can be sure if the symbol should be truly set to destroyed or
>> reverted to the previous state.
>>
>> Most path-sensitive checkers are kind of state machines (see also
>> "typestate"). They assign a state to an object, and then various events
>> (coming from checker callbacks) affect this state in one way or another.
>> You can draw the state diagram.
>>
>> So my proposal is to have the following states: Initialized, Locked,
>> Unlocked, SchrödingerInitialized, SchrödingerLocked, SchrödingerUnlocked,
>> Destroyed. Schrödinger states indicate that we're having a Schrödinger
>> mutex that is alive and destroyed at the same time on different execution
>> paths. The mutex becomes a Schrödinger mutex after pthread_mutex_destroy()
>> and remains there until the box is opened^W^W^W^W either we try to conduct
>> more operations on the mutex, or the return-symbol of
>> pthread_mutex_destroy() is dead. At such moments we collapse the mutex
>> state to either the previous state (eg. SchrödingerLocked -> Locked)  or to
>> Destroyed. The easiest way to represent Schrödinger states would be
>> something like "still Locked, but with destroy-return-symbol set present
>> for this mutex"; upon collapse we remove the return symbol.
>>
>> On 4/19/17 8:54 AM, Malhar Thakkar wrote:
>>
>>> Dear Dr. Artem,
>>>
>>> Thank you for your response. After incorporating 'addTransition', I was
>>> able to suppress the double-destroy. But, when 'pthread_mutex_destroy' was
>>> called for the second time (i.e., inside the if branch), LockState was
>>> found to be NULL (which it should be according to my code) but ideally it
>>> should be in the Unlocked state. For, it to be in Unlocked state, I think
>>> state->set<LockMap>() should be executed inside checkDeadSymbols() instead
>>> of executing it inside AcquireLock(), ReleaseLock(), DestroyLock() and
>>> InitLock() as I am unable to find a way to do it otherwise.
>>>
>>> Regards,
>>> Malhar
>>> ᐧ
>>>
>>> On Tue, Apr 18, 2017 at 8:30 PM, Artem Dergachev <noqnoqneo at gmail.com
>>> <mailto:noqnoqneo at gmail.com>> wrote:
>>>
>>>     Before anything, please note that `State' is your local variable;
>>>     assigning to `State' doesn't immediately affect analysis in any
>>>     way. In order to make your changes to the program state take
>>>     effect, you'd need to put the changed state back into the
>>>     `CheckerContext' by calling its `addTransition' method, like other
>>>     checker callbacks do.
>>>
>>>     Otherwise, your code looks reasonable to me. If everything is
>>>     implemented correctly, i believe that your code would already
>>>     suppress the double-destroy warning on this branch.
>>>
>>>
>>>     On 4/18/17 5:53 PM, Malhar Thakkar wrote:
>>>
>>>         Test-case:
>>>
>>>         pthread_mutex_lock(&mtx1);
>>>         pthread_mutex_unlock(&mtx1);
>>>         int retval = pthread_mutex_destroy(&mtx1);
>>>         if(retval != 0){
>>>         pthread_mutex_destroy(&mtx1); (1)
>>>         }
>>>         else
>>>         printf("Already destroyed\n");
>>>
>>>
>>>         Regarding my previous query about reverting the LockState, how
>>>         would I ensure that the LockState at (1) is Unlocked and not
>>>         Destroyed?
>>>
>>>         Following is the code for checkDeadSymbols()
>>>         void PthreadLockChecker::checkDeadSymbols(SymbolReaper
>>> &SymReaper,
>>>                            CheckerContext &C) const {
>>>           ProgramStateRef State = C.getState();
>>>           ConstraintManager &CMgr = State->getConstraintManager();
>>>
>>>           RetValConstraintTy TrackedSymbols =
>>>         State->get<RetValConstraint>();
>>>           for (RetValConstraintTy::iterator I = TrackedSymbols.begin(),
>>>              E = TrackedSymbols.end(); I != E; ++I) {
>>>             SymbolRef Sym = I->second;
>>>             const MemRegion* lockR = I->first;
>>>             bool IsSymDead = SymReaper.isDead(Sym);
>>>             const LockState* LState = State->get<LockMap>(lockR);
>>>             // Remove the dead symbol from the return value symbols map.
>>>             if (IsSymDead){
>>>               ConditionTruthVal retZero = CMgr.isNull(State, Sym);
>>>         if(retZero.isConstrainedFalse()){
>>>                 // Update LockMap
>>>                 State = State->remove<LockMap>(lockR); // I know this
>>>         is incorrect but even after removing this entry
>>>     from the map, (1) raises a warning saying, "This lock has
>>>         already been destroyed".
>>>               }
>>>               State = State->remove<RetValConstraint>(lockR);
>>>             }
>>>           }
>>>         }
>>>
>>>         Thank you.
>>>
>>>
>>>         Regards,
>>>         Malhar
>>>
>>>
>>>         ᐧ
>>>
>>>         On Mon, Apr 17, 2017 at 8:34 PM, Devin Coughlin
>>>         <dcoughlin at apple.com <mailto:dcoughlin at apple.com>
>>>         <mailto:dcoughlin at apple.com <mailto:dcoughlin at apple.com>>>
>>> wrote:
>>>
>>>
>>>                 On Apr 17, 2017, at 4:55 AM, Malhar Thakkar
>>>             <cs13b1031 at iith.ac.in <mailto:cs13b1031 at iith.ac.in>
>>>                 <mailto:cs13b1031 at iith.ac.in
>>>             <mailto:cs13b1031 at iith.ac.in>>> wrote:
>>>
>>>                 Consider the following test-case:
>>>
>>>                 int retval = pthread_mutex_destroy(&mtx1);
>>>                 if(retval == 0){
>>>                 pthread_mutex_lock(&mtx1);
>>>                 }
>>>
>>>
>>>                 REGISTER_MAP_WITH_PROGRAMSTATE(RetValConstraint, const
>>>             MemRegion
>>>                 *, SymbolRef)
>>>
>>>                 I have added the following snippet for checkDeadSymbols:
>>>                 void PthreadLockChecker::checkDeadSymbols(SymbolReaper
>>>             &SymReaper,
>>>                  CheckerContext &C) const {
>>>                   // std::cout<<"Checking dead symbols\n";
>>>                 ProgramStateRef State = C.getState();
>>>                 ConstraintManager &CMgr = State->getConstraintManager();
>>>
>>>                 RetValConstraintTy TrackedSymbols =
>>>             State->get<RetValConstraint>();
>>>                   // int counter = 0;
>>>                   for (RetValConstraintTy::iterator I =
>>>             TrackedSymbols.begin(),
>>>                                        E = TrackedSymbols.end(); I !=
>>>             E; ++I) {
>>>                 // counter++;
>>>                 // std::cout << "Counter: "<<counter<<std::endl;
>>>                 SymbolRef Sym = I->second;
>>>                 const MemRegion* lockR = I->first;
>>>                 bool IsSymDead = SymReaper.isDead(Sym);
>>>
>>>                 // Remove the dead symbol from the return value
>>>             symbols map.
>>>                 if (IsSymDead){
>>>                 ConditionTruthVal retZero = CMgr.isNull(State, Sym);
>>>                 if(retZero.isConstrainedFalse()){
>>>                   std::cout<<"False\n";
>>>                   // Update LockMap
>>>                 }
>>>                 else if(retZero.isConstrainedTrue()){
>>>                   std::cout<<"True\n";
>>>                 }
>>>                 State = State->remove<RetValConstraint>(lockR);
>>>                 std::cout<<"Removed\n";
>>>                     }
>>>                   }
>>>                 }
>>>
>>>                 Now, after the execution of
>>>             PthreadLockChecker::DestroyLock(),
>>>                 checkDeadSymbols() is executed twice before
>>>                 PthreadLockChecker::AcquireLock() is executed.
>>>
>>>                 When checkDeadSymbols is first executed, there is just
>>>             one symbol
>>>                 in RetValConstraint(trivial). But, this symbol is
>>>             constrained
>>>                 /not /to be NULL. Then, the symbol is removed. During the
>>>
>>>                 execution of checkDeadSymbols for the second time,
>>>             there is again
>>>                 one symbol in RetValConstraint(I don't understand
>>>             this. Shouldn't
>>>                 this symbol have been destroyed during the previous
>>>             execution?).
>>>                 Nevertheless, this time, this symbol is constrained to
>>>             be NULL.
>>>
>>>
>>>             The static analyzer performs a path-sensitive exploration
>>>         of the
>>>             program. This means that when it sees a branch it will
>>> explore
>>>             both both sides of the branch independently (unless it can
>>>         tell
>>>             that one side is infeasible). In this case it means the
>>>         analyzer
>>>             will explore the path when retval is 0 and when it is not.
>>>         If you
>>>             haven’t read it yet, the “Debugging” section the the checker
>>>             development manual describes some tricks for how to
>>>         visualize the
>>>             analyzer’s exploration.
>>> <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands
>>> <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands>
>>> <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands
>>> <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands>>>.
>>>             I find the debug.ViewExplodedGraph checker particularly
>>>         helpful.
>>>
>>>                 Also, I have one more query. After fixing the above
>>>             mentioned
>>>                 issue, how do I update LockMap in the sense that I'll
>>>             need to
>>>                 revert the change made to it if the symbol is
>>>             constrained not to
>>>                 be NULL. Apart from keeping track of the previous
>>>             LockState, is
>>>                 there any other way to do it?
>>>
>>>
>>>             Since the analyzer keeps its representation of the program
>>>         state
>>>             per-path and explores paths independently, you shouldn’t
>>>         need to
>>>             do any reverting to undo the effects of simulating other
>>>         paths to
>>>             reaching a particular program point.
>>>
>>>             Devin
>>>
>>>
>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170420/f6df89ab/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PthreadLockChecker.cpp
Type: text/x-c++src
Size: 15802 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170420/f6df89ab/attachment.cpp>