[cfe-dev] exhaustiveness of CSA checkers

Wed Jan 15 15:17:44 PST 2020

Thanks, Artem. Lots of useful flags. Unfortunately I have already experimented with all of them though.

> -----Original Message-----
> From: Artem Dergachev <noqnoqneo at gmail.com>
> Sent: Wednesday, January 15, 2020 14:36
> To: Fernandez, Matthew <matthew.fernandez at intel.com>; Gábor Horváth
> <xazax at google.com>
> Cc: cfe-dev at lists.llvm.org
> Subject: Re: [cfe-dev] exhaustiveness of CSA checkers
> 
> Yup, i mostly agree with Gabor. If you want to hack on the Clang Static
> Analyzer in order to push it past its default limits, you can try bumping the
> following flags (intended for hacking purposes only!):
> 
>    // Interrupts the analysis when a CFG block is visited that many times.
>    -analyzer-max-loop=4
> 
>    // Interrupts the analysis when ExplodedGraph has that many nodes.
>    -analyzer-config max-nodes=225000
> 
> and setting the following flags (intended for hacking purposes only!):
> 
>    // Allow unrolling loops indefinitely when the concrete bound is known
> (currently off by default).
>    -analyzer-config unroll-loops=true
> 
>    // Disables function inlining (you said you don't need IPA).
>    -analyzer-config ipa=none
> 
> and also removing the artificial heuristics for loop unrolling (that attempt
> to discover whether the loop is statically bounded in LoopUnrolling.cpp). That
> would give you the most complete exploration the Static Analyzer could ever
> achieve. There may be more flags that i forgot about, but the above should be
> pretty good.
> 
> It is still impossible to achieve hardcore *verification* this way. The
> Analyzer will occasionally drop execution paths for many other reasons, and
> these reasons are fairly hard to enumerate. Like, it may encounter exotic
> language constructs that it still doesn't yet understand, or simply becomes
> too confused to continue, or it might turn out that an execution path looks
> infeasible to the Static Analyzer because of a bug but it may be taken in
> reality. At the end you will never be sure that the program is definitely
> correct.
> 
> But if you simply want to find "most" of the bugs, for a certain definition of
> "most", the above should do the trick.
> 
> 
> On 1/15/20 9:17 PM, Fernandez, Matthew wrote:
> >
> > Thanks, Gabor. Sounds like this might be beyond CSA’s abilities.
> > Answers to your very apt questions below.
> >
> > /> How much control-flow awareness do you need? Do you really need
> > path-sensitivity or flow-sensitive is sufficient? Or maybe lexical
> > scoping is enough?/
> >
> > It seems to me currently that we need path-sensitivity. I’ve been
> > exploring some lexical approaches in parallel though, including an
> > RAII-style rephrasing of the code.
> >
> > /> Do you need interprocedural analysis? If so, do you have recursion?
> > Do you need context sensitivity? Can you add annotations to help guide
> > the analysis?/
> >
> > No IPA necessary and there’s no recursion. In most cases we can ignore
> > context, even to the extent of ignoring the content of a conditional
> > and just noting there’s a branch in control flow (with the exception
> > of when a branch condition depends on a lock acquisition result).
> > Annotations are possible, but if it comes to this I would probably
> > look at more drastic refactoring of the code. CSA would not be the
> > only thing consuming this code, so it would still need to be correct
> > and complete without the annotations.
> >
> > /> How complex is the task that you want to accomplish? Are locks
> > reentrant? Do you have to support more complex try_lock style APIs? Or
> > is it sufficient to only check the order of the API calls?/
> >
> > The locks are not re-entrant, though their /only/ APIs are
> > try_lock-style. So the analysis needs to comprehend whether lock
> > acquisition succeeded or failed.
> >
> > /> you could take a look at Thread Safety/
> >
> > Interesting, I was not aware of this. It looks like maybe I can make
> > this work for my purposes. Thanks for the pointer.
> >
> > *From:* Gábor Horváth <xazax at google.com>
> > *Sent:* Wednesday, January 15, 2020 09:15
> > *To:* Fernandez, Matthew <matthew.fernandez at intel.com>; Artem
> > Dergachev <noqnoqneo at gmail.com>
> > *Cc:* cfe-dev at lists.llvm.org
> > *Subject:* Re: [cfe-dev] exhaustiveness of CSA checkers
> >
> > (Adding Artem as he is very knowledgeable in this topic)
> >
> > Oh, I see. In case it is known that you have a bounded number of paths
> > it is not entirely unreasonable to use symbolic execution to achieve
> > what you want.
> >
> > Unfortunately, this is not a use-case that the static analyzer was
> > designed for. I think it should be possible to tweak it but I have no
> > idea how much work would that be.
> >
> > But even though it might be possible to tweak the analyzer I am not
> > sure if this would be the right thing to do. Some questions that might
> > help:
> >
> > 1. How much control-flow awareness do you need? Do you really need
> > path-sensitivity or flow-sensitive is sufficient? Or maybe lexical
> > scoping is enough?
> >
> > You only need path sensitive check if you want to avoid false
> > positives in the form of:
> >
> > if (cond)
> >   lock();
> >
> > // ...
> >
> > if (cond)
> >
> >   unlock();
> >
> > It looks like you already have some constraints on the coding style in
> > the code you want to check. So I guess there is a chance that users
> > are not allowed to do locking using complex patterns like the one
> > above. If that is the case, flow-sensitive analysis might be a better
> > fit as it is easier to make that exhaustive and will perform much better.
> >
> > Or in case RAII style locking would be sufficient but you do not have
> > dtors in C, you can have syntactic checks that enforce hand-written
> > RAII style resource management.
> >
> > 2. Do you need interprocedural analysis? If so, do you have recursion?
> > Do you need context sensitivity? Can you add annotations to help guide
> > the analysis?
> >
> > 3. How complex is the task that you want to accomplish? Are locks
> > reentrant? Do you have to support more complex try_lock style APIs? Or
> > is it sufficient to only check the order of the API calls?
> >
> > In case you can add annotations and you do not need path sensitivity
> > you could take a look at Thread Safety Analysis:
> > https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
> >
> > Cheers,
> >
> > Gabor
> >
> > On Wed, Jan 15, 2020 at 8:48 AM Fernandez, Matthew
> > <matthew.fernandez at intel.com <mailto:matthew.fernandez at intel.com>> wrote:
> >
> > Hi Gabor,
> >
> > Thanks for your reply. The checker I’m implementing is similar to
> > PthreadLockChecker. It knows the correct acquire/release patterns for
> > certain primitives and checks for them. If analysis fails to reach the
> > end of a function, the checker cannot warn for e.g. unreleased locks.
> >
> > This is a somewhat unorthodox case as I know the target code to which
> > this will be applied. All functions are <500LoC and the only loops are
> > statically bounded. It is observable statically that all functions
> > terminate and there are a finite number of paths.
> >
> > I was hoping to use CSA for this because it handles path enumeration
> > and constructing the exploded graph very nicely. Someone suggested to
> > me I might have to move to KLEE, but that would be a shame because I’d
> > need to introduce some code instrumentation/annotation to achieve what
> > I want. Another option would be to use an AST visitor to enumerate the
> > paths myself, but it would be nice to leverage LLVM’s existing
> > functionality for this.
> >
> > Thanks,
> >
> > Matthew
> >
> > *From:* Gábor Horváth <xazax at google.com <mailto:xazax at google.com>>
> > *Sent:* Wednesday, January 15, 2020 08:13
> > *To:* Fernandez, Matthew <matthew.fernandez at intel.com
> > <mailto:matthew.fernandez at intel.com>>
> > *Cc:* cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
> > *Subject:* Re: [cfe-dev] exhaustiveness of CSA checkers
> >
> > Hi!
> >
> > The clang static analyzer does not give you any guarantees regarding
> > the coverage/exhaustiveness. There is no way to ensure exhaustive
> > analysis (such analysis is likely to be unbounded for most non-trivial
> > programs, so this is not only about runtime, but also termination).
> > For this reason all the checks have to be implemented with
> > non-exhaustiveness in mind.
> >
> > Could you share what you are trying to achieve? Maybe symbolic
> > execution is not the right tool for that problem.
> >
> > Cheers,
> >
> > Gabor
> >
> > On Wed, Jan 15, 2020 at 12:58 AM Fernandez, Matthew via cfe-dev
> > <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
> >
> > Hello cfe-dev,
> >
> > In prototyping a custom checker for the Clang Static Analyzer, I’ve
> > found analysis terminates at some complexity limit. That is, when your
> > target function exceeds some complexity bound, CSA stops path
> > traversal and your checker does not receive callbacks for any
> > remaining unvisited nodes. The two specific scenarios where I’ve run
> > into this are high-iteration-count loops and complex conditionals
> > (multiple short circuiting && and || operators). The first I can work
> > around by rephrasing the target loops or something like
> > -analyzer-max-loop, but I can’t find a way to affect the behavior of
> > the second. To compound the situation, I cannot see how the checker
> > can detect that path exploration was incomplete.
> >
> > Is there a way to control the complexity limit enforced for
> > conditionals? Or, failing that, to detect within the checker when path
> > exploration was incomplete?
> >
> > To give some more context, my checker is an experiment and not
> > something I am intending to upstream. Runtime is not an issue; I am
> > fine with the analyzer taking multiple hours for a single run. Though
> > I understand why the existing CSA bound choices have been made, as
> > most users do not want their compiler to run for this long.
> >
> > Please CC me in replies as I’m not subscribed.
> >
> > Thanks,
> >
> > Matthew
> >
> > _______________________________________________
> > cfe-dev mailing list
> > cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> >