[cfe-dev] Suggested starter bug on clang analyzer codebase

Thu Apr 20 10:58:00 PDT 2017

Malhar,

It would be great if you could upload your diff to our Phabricator at http://reviews.llvm.org — this is the online tool we use for code reviews. We can comment on specific code (rather than the overall approach) there.

There are instructions for using Phabricator at http://llvm.org/docs/Phabricator.html

Devin

> On Apr 20, 2017, at 9:16 AM, Artem Dergachev <noqnoqneo at gmail.com> wrote:
> 
> Yep, you're right, no need for SchrodingerLocked!
> 
> (we'd need to come up with better names, because "Shrodinger mutex" is something i just came up with for explaining... Eg., UnlockedAndPossiblyDestroyed. Or just remove the new verbose enum values for the new states and look at if the symbol is present in the DestroyRetVal map instead). Note that when you have a DestroyRetVal symbol for a mutex, then the mutex  *definitely* has a state, which is one of the "Shrodinger" states, and vice versa; you can replace quite a lot of your "else" branches with *assertions*, because that's entirely within your checker's area of expertise to keep it that way. You may also probably try to de-duplicate some code into auxiliary functions.
> 
> lck_mtx_destroy() is void. It doesn't have a return value. Seems that it's good as is :) You can vary behavior depending on the LockingSemantics argument (if it's XNUSemantics, skip stuff).
> 
> I'd probably also have a closer look tomorrow.
> 
> 
> 
> 
> On 4/20/17 5:51 PM, Malhar Thakkar wrote:
>> Dear Dr. Artem,
>> 
>> I have the following queries.
>> 
>>  * Will I ever require SchrodingerLocked? (As you mentioned, a mutex
>>    becomes a Schrodinger only when pthread_mutex_destroy() is called.
>>    Now, calling pthread_mutex_destroy() after pthread_mutex_lock() is
>>    anyway wrong. So, a warning should be raised in this case.)
>>  * SVal corresponding to lck_mtx_destroy() is Unknown or Undefined.
>>    Hence, I am not able to obtain a symbol for the return value of
>>    lck_mtx_destroy() which is leading to incorrect output on the
>>    testcase pthreadlock.c.
>> 
>> I'm also attaching the modified code.
>> I want you to check certain parts of the code. (I have added capitalized comments just before the parts which I want you to check.)
>> 
>> Thank you.
>> 
>> 
>> Regards,
>> Malhar
>> ᐧ
>> 
>> On Wed, Apr 19, 2017 at 12:11 PM, Artem Dergachev <noqnoqneo at gmail.com <mailto:noqnoqneo at gmail.com> <mailto:noqnoqneo at gmail.com <mailto:noqnoqneo at gmail.com>>> wrote:
>> 
>>    For completeness, i'd point out that there's a simpler alternative
>>    approach to this: force the analyzer to split the state
>>    immediately on pthread_mutex_destroy(): construct a state where
>>    the mutex is destroyed and symbol is known-to-be-zero, construct a
>>    state where the mutex remains the same and symbol is
>>    known-to-be-non-zero, and addTransition to both of them. This
>>    forces the analyzer to investigate two concrete states in
>>    parallel, rather than one "Schrödinger" state that would still be
>>    split when the return value is checked.
>> 
>>    However, that approach pays a huge price of doubling analysis time
>>    - the time we would spend in the "Schrödinger" state in the
>>    original approach would be doubled here (and if the return value
>>    is truly unchecked, that may go on until the end of the analysis,
>>    because even after the symbol dies the mutex state is still
>>    different, so the paths wouldn't be merged). This is why we try to
>>    avoid state splits unless it is definitely necessary.
>> 
>> 
>>    On 4/19/17 9:22 AM, Artem Dergachev wrote:
>> 
>>        Yep!
>> 
>>        You are right that simply unlocking the failed-to-be-destroyed
>>        mutex in checkDeadSymbols is not enough. The reason for that
>>        is, the moment when the symbol dies is quite unpredictable,
>>        and we don't know how many things have happened to the mutex
>>        between the failed destroy and the symbol death - it might
>>        have been locked and unlocked many times in between.
>> 
>>        I'd propose that in order to maintain the correct mutex state
>>        you'd need to check the destruction-return-value symbol in
>>        every operation as well. For instance, when somebody tries to
>>        lock a mutex, see if it was destroyed previously. If it was
>>        destroyed, look at the symbol:
>>        * If the destruction was checked and failed on that execution
>>        path (as you can find out by looking at the symbol's
>>        constraints in the program state), then it's ok, remove that
>>        symbol from the program state and mark the mutex as locked.
>>        * If the destruction cannot be proven to have failed on this
>>        path, then either it is known to have succeeded, or
>>        destruction wasn't checked for failure properly on this path.
>>        It means we have a use-after-destroy, and we report the
>>        warning accordingly.
>> 
>>        At the same time, checkDeadSymbols is still necessary: if no
>>        mutex state changes happen after destroy before symbol death,
>>        we need to collect the information from the dying symbol.
>>        Because the symbol is dying, this information will no longer
>>        ever be made more accurate (that is, constraints wouldn't ever
>>        become more strict anymore), so it's as accurate as possible.
>>        So we can be sure if the symbol should be truly set to
>>        destroyed or reverted to the previous state.
>> 
>>        Most path-sensitive checkers are kind of state machines (see
>>        also "typestate"). They assign a state to an object, and then
>>        various events (coming from checker callbacks) affect this
>>        state in one way or another. You can draw the state diagram.
>> 
>>        So my proposal is to have the following states: Initialized,
>>        Locked, Unlocked, SchrödingerInitialized, SchrödingerLocked,
>>        SchrödingerUnlocked, Destroyed. Schrödinger states indicate
>>        that we're having a Schrödinger mutex that is alive and
>>        destroyed at the same time on different execution paths. The
>>        mutex becomes a Schrödinger mutex after
>>        pthread_mutex_destroy() and remains there until the box is
>>        opened^W^W^W^W either we try to conduct more operations on the
>>        mutex, or the return-symbol of pthread_mutex_destroy() is
>>        dead. At such moments we collapse the mutex state to either
>>        the previous state (eg. SchrödingerLocked -> Locked)  or to
>>        Destroyed. The easiest way to represent Schrödinger states
>>        would be something like "still Locked, but with
>>        destroy-return-symbol set present for this mutex"; upon
>>        collapse we remove the return symbol.
>> 
>>        On 4/19/17 8:54 AM, Malhar Thakkar wrote:
>> 
>>            Dear Dr. Artem,
>> 
>>            Thank you for your response. After incorporating
>>            'addTransition', I was able to suppress the
>>            double-destroy. But, when 'pthread_mutex_destroy' was
>>            called for the second time (i.e., inside the if branch),
>>            LockState was found to be NULL (which it should be
>>            according to my code) but ideally it should be in the
>>            Unlocked state. For, it to be in Unlocked state, I think
>>            state->set<LockMap>() should be executed inside
>>            checkDeadSymbols() instead of executing it inside
>>            AcquireLock(), ReleaseLock(), DestroyLock() and InitLock()
>>            as I am unable to find a way to do it otherwise.
>> 
>>            Regards,
>>            Malhar
>>            ᐧ
>> 
>>            On Tue, Apr 18, 2017 at 8:30 PM, Artem Dergachev
>>            <noqnoqneo at gmail.com <mailto:noqnoqneo at gmail.com> <mailto:noqnoqneo at gmail.com <mailto:noqnoqneo at gmail.com>>
>>            <mailto:noqnoqneo at gmail.com <mailto:noqnoqneo at gmail.com> <mailto:noqnoqneo at gmail.com <mailto:noqnoqneo at gmail.com>>>>
>>            wrote:
>> 
>>                Before anything, please note that `State' is your
>>            local variable;
>>                assigning to `State' doesn't immediately affect
>>            analysis in any
>>                way. In order to make your changes to the program
>>            state take
>>                effect, you'd need to put the changed state back into the
>>                `CheckerContext' by calling its `addTransition'
>>            method, like other
>>                checker callbacks do.
>> 
>>                Otherwise, your code looks reasonable to me. If
>>            everything is
>>                implemented correctly, i believe that your code would
>>            already
>>                suppress the double-destroy warning on this branch.
>> 
>> 
>>                On 4/18/17 5:53 PM, Malhar Thakkar wrote:
>> 
>>                    Test-case:
>> 
>>                    pthread_mutex_lock(&mtx1);
>>                    pthread_mutex_unlock(&mtx1);
>>                    int retval = pthread_mutex_destroy(&mtx1);
>>                    if(retval != 0){
>>                    pthread_mutex_destroy(&mtx1); (1)
>>                    }
>>                    else
>>                    printf("Already destroyed\n");
>> 
>> 
>>                    Regarding my previous query about reverting the
>>            LockState, how
>>                    would I ensure that the LockState at (1) is
>>            Unlocked and not
>>                    Destroyed?
>> 
>>                    Following is the code for checkDeadSymbols()
>>                    void
>>            PthreadLockChecker::checkDeadSymbols(SymbolReaper &SymReaper,
>>                                       CheckerContext &C) const {
>>                      ProgramStateRef State = C.getState();
>>                      ConstraintManager &CMgr =
>>            State->getConstraintManager();
>> 
>>                      RetValConstraintTy TrackedSymbols =
>>                    State->get<RetValConstraint>();
>>                      for (RetValConstraintTy::iterator I =
>>            TrackedSymbols.begin(),
>>                         E = TrackedSymbols.end(); I != E; ++I) {
>>                        SymbolRef Sym = I->second;
>>                        const MemRegion* lockR = I->first;
>>                        bool IsSymDead = SymReaper.isDead(Sym);
>>                        const LockState* LState =
>>            State->get<LockMap>(lockR);
>>                        // Remove the dead symbol from the return
>>            value symbols map.
>>                        if (IsSymDead){
>>                          ConditionTruthVal retZero =
>>            CMgr.isNull(State, Sym);
>>                    if(retZero.isConstrainedFalse()){
>>                            // Update LockMap
>>                            State = State->remove<LockMap>(lockR); //
>>            I know this
>>                    is incorrect but even after removing this entry                                 from the map, (1) raises a warning
>>            saying, "This lock has
>>                    already been destroyed".
>>                          }
>>                          State = State->remove<RetValConstraint>(lockR);
>>                        }
>>                      }
>>                    }
>> 
>>                    Thank you.
>> 
>> 
>>                    Regards,
>>                    Malhar
>> 
>> 
>>                    ᐧ
>> 
>>                    On Mon, Apr 17, 2017 at 8:34 PM, Devin Coughlin
>>                    <dcoughlin at apple.com <mailto:dcoughlin at apple.com> <mailto:dcoughlin at apple.com <mailto:dcoughlin at apple.com>>
>>            <mailto:dcoughlin at apple.com <mailto:dcoughlin at apple.com> <mailto:dcoughlin at apple.com <mailto:dcoughlin at apple.com>>>
>>                    <mailto:dcoughlin at apple.com <mailto:dcoughlin at apple.com>
>>            <mailto:dcoughlin at apple.com <mailto:dcoughlin at apple.com>> <mailto:dcoughlin at apple.com <mailto:dcoughlin at apple.com>
>>            <mailto:dcoughlin at apple.com <mailto:dcoughlin at apple.com>>>>> wrote:
>> 
>> 
>>                            On Apr 17, 2017, at 4:55 AM, Malhar Thakkar
>>                        <cs13b1031 at iith.ac.in <mailto:cs13b1031 at iith.ac.in>
>>            <mailto:cs13b1031 at iith.ac.in <mailto:cs13b1031 at iith.ac.in>> <mailto:cs13b1031 at iith.ac.in <mailto:cs13b1031 at iith.ac.in>
>>            <mailto:cs13b1031 at iith.ac.in <mailto:cs13b1031 at iith.ac.in>>>
>>                            <mailto:cs13b1031 at iith.ac.in <mailto:cs13b1031 at iith.ac.in>
>>            <mailto:cs13b1031 at iith.ac.in <mailto:cs13b1031 at iith.ac.in>>
>>                        <mailto:cs13b1031 at iith.ac.in <mailto:cs13b1031 at iith.ac.in>
>>            <mailto:cs13b1031 at iith.ac.in <mailto:cs13b1031 at iith.ac.in>>>>> wrote:
>> 
>>                            Consider the following test-case:
>> 
>>                            int retval = pthread_mutex_destroy(&mtx1);
>>                            if(retval == 0){
>>                            pthread_mutex_lock(&mtx1);
>>                            }
>> 
>> 
>>                                       REGISTER_MAP_WITH_PROGRAMSTATE(RetValConstraint, const
>>                        MemRegion
>>                            *, SymbolRef)
>> 
>>                            I have added the following snippet for
>>            checkDeadSymbols:
>>                            void
>>            PthreadLockChecker::checkDeadSymbols(SymbolReaper
>>                        &SymReaper,
>>                             CheckerContext &C) const {
>>                              // std::cout<<"Checking dead symbols\n";
>>                            ProgramStateRef State = C.getState();
>>                            ConstraintManager &CMgr =
>>            State->getConstraintManager();
>> 
>>                            RetValConstraintTy TrackedSymbols =
>>                        State->get<RetValConstraint>();
>>                              // int counter = 0;
>>                              for (RetValConstraintTy::iterator I =
>>                        TrackedSymbols.begin(),
>>                                                   E =
>>            TrackedSymbols.end(); I !=
>>                        E; ++I) {
>>                            // counter++;
>>                            // std::cout << "Counter:
>>            "<<counter<<std::endl;
>>                            SymbolRef Sym = I->second;
>>                            const MemRegion* lockR = I->first;
>>                            bool IsSymDead = SymReaper.isDead(Sym);
>> 
>>                            // Remove the dead symbol from the return
>>            value
>>                        symbols map.
>>                            if (IsSymDead){
>>                            ConditionTruthVal retZero =
>>            CMgr.isNull(State, Sym);
>>                            if(retZero.isConstrainedFalse()){
>>                              std::cout<<"False\n";
>>                              // Update LockMap
>>                            }
>>                            else if(retZero.isConstrainedTrue()){
>>                              std::cout<<"True\n";
>>                            }
>>                            State =
>>            State->remove<RetValConstraint>(lockR);
>>                            std::cout<<"Removed\n";
>>                                }
>>                              }
>>                            }
>> 
>>                            Now, after the execution of
>>                        PthreadLockChecker::DestroyLock(),
>>                            checkDeadSymbols() is executed twice before
>>                            PthreadLockChecker::AcquireLock() is executed.
>> 
>>                            When checkDeadSymbols is first executed,
>>            there is just
>>                        one symbol
>>                            in RetValConstraint(trivial). But, this
>>            symbol is
>>                        constrained
>>                            /not /to be NULL. Then, the symbol is
>>            removed. During the
>> 
>>                            execution of checkDeadSymbols for the
>>            second time,
>>                        there is again
>>                            one symbol in RetValConstraint(I don't
>>            understand
>>                        this. Shouldn't
>>                            this symbol have been destroyed during the
>>            previous
>>                        execution?).
>>                            Nevertheless, this time, this symbol is
>>            constrained to
>>                        be NULL.
>> 
>> 
>>                        The static analyzer performs a path-sensitive
>>            exploration
>>                    of the
>>                        program. This means that when it sees a branch
>>            it will explore
>>                        both both sides of the branch independently
>>            (unless it can
>>                    tell
>>                        that one side is infeasible). In this case it
>>            means the
>>                    analyzer
>>                        will explore the path when retval is 0 and
>>            when it is not.
>>                    If you
>>                        haven’t read it yet, the “Debugging” section
>>            the the checker
>>                        development manual describes some tricks for
>>            how to
>>                    visualize the
>>                        analyzer’s exploration.
>>            <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands>
>>            <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands>>
>>            <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands>
>>            <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands>>>
>>            <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands
>>            <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands>
>>            <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands
>>            <https://clang-analyzer.llvm.org/checker_dev_manual.html#commands>>>>.
>>                        I find the debug.ViewExplodedGraph checker
>>            particularly
>>                    helpful.
>> 
>>                            Also, I have one more query. After fixing
>>            the above
>>                        mentioned
>>                            issue, how do I update LockMap in the
>>            sense that I'll
>>                        need to
>>                            revert the change made to it if the symbol is
>>                        constrained not to
>>                            be NULL. Apart from keeping track of the
>>            previous
>>                        LockState, is
>>                            there any other way to do it?
>> 
>> 
>>                        Since the analyzer keeps its representation of
>>            the program
>>                    state
>>                        per-path and explores paths independently, you
>>            shouldn’t
>>                    need to
>>                        do any reverting to undo the effects of
>>            simulating other
>>                    paths to
>>                        reaching a particular program point.
>> 
>>                        Devin

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170420/0297ce84/attachment.html>