[cfe-dev] exhaustiveness of CSA checkers

Wed Jan 15 10:17:29 PST 2020

Thanks, Gabor. Sounds like this might be beyond CSA’s abilities. Answers to your very apt questions below.

> How much control-flow awareness do you need? Do you really need path-sensitivity or flow-sensitive is sufficient? Or maybe lexical scoping is enough?
It seems to me currently that we need path-sensitivity. I’ve been exploring some lexical approaches in parallel though, including an RAII-style rephrasing of the code.

> Do you need interprocedural analysis? If so, do you have recursion? Do you need context sensitivity? Can you add annotations to help guide the analysis?
No IPA necessary and there’s no recursion. In most cases we can ignore context, even to the extent of ignoring the content of a conditional and just noting there’s a branch in control flow (with the exception of when a branch condition depends on a lock acquisition result). Annotations are possible, but if it comes to this I would probably look at more drastic refactoring of the code. CSA would not be the only thing consuming this code, so it would still need to be correct and complete without the annotations.

> How complex is the task that you want to accomplish? Are locks reentrant? Do you have to support more complex try_lock style APIs? Or is it sufficient to only check the order of the API calls?
The locks are not re-entrant, though their only APIs are try_lock-style. So the analysis needs to comprehend whether lock acquisition succeeded or failed.

> you could take a look at Thread Safety
Interesting, I was not aware of this. It looks like maybe I can make this work for my purposes. Thanks for the pointer.

From: Gábor Horváth <xazax at google.com>
Sent: Wednesday, January 15, 2020 09:15
To: Fernandez, Matthew <matthew.fernandez at intel.com>; Artem Dergachev <noqnoqneo at gmail.com>
Cc: cfe-dev at lists.llvm.org
Subject: Re: [cfe-dev] exhaustiveness of CSA checkers

(Adding Artem as he is very knowledgeable in this topic)

Oh, I see. In case it is known that you have a bounded number of paths it is not entirely unreasonable to use symbolic execution to achieve what you want.

Unfortunately, this is not a use-case that the static analyzer was designed for. I think it should be possible to tweak it but I have no idea how much work would that be.

But even though it might be possible to tweak the analyzer I am not sure if this would be the right thing to do. Some questions that might help:

1. How much control-flow awareness do you need? Do you really need path-sensitivity or flow-sensitive is sufficient? Or maybe lexical scoping is enough?

You only need path sensitive check if you want to avoid false positives in the form of:
if (cond)
  lock();
// ...
if (cond)
  unlock();

It looks like you already have some constraints on the coding style in the code you want to check. So I guess there is a chance that users are not allowed to do locking using complex patterns like the one above. If that is the case, flow-sensitive analysis might be a better fit as it is easier to make that exhaustive and will perform much better.
Or in case RAII style locking would be sufficient but you do not have dtors in C, you can have syntactic checks that enforce hand-written RAII style resource management.

2. Do you need interprocedural analysis? If so, do you have recursion? Do you need context sensitivity? Can you add annotations to help guide the analysis?

3. How complex is the task that you want to accomplish? Are locks reentrant? Do you have to support more complex try_lock style APIs? Or is it sufficient to only check the order of the API calls?

In case you can add annotations and you do not need path sensitivity you could take a look at Thread Safety Analysis: https://clang.llvm.org/docs/ThreadSafetyAnalysis.html

Cheers,
Gabor

On Wed, Jan 15, 2020 at 8:48 AM Fernandez, Matthew <matthew.fernandez at intel.com<mailto:matthew.fernandez at intel.com>> wrote:
Hi Gabor,

Thanks for your reply. The checker I’m implementing is similar to PthreadLockChecker. It knows the correct acquire/release patterns for certain primitives and checks for them. If analysis fails to reach the end of a function, the checker cannot warn for e.g. unreleased locks.

This is a somewhat unorthodox case as I know the target code to which this will be applied. All functions are <500LoC and the only loops are statically bounded. It is observable statically that all functions terminate and there are a finite number of paths.

I was hoping to use CSA for this because it handles path enumeration and constructing the exploded graph very nicely. Someone suggested to me I might have to move to KLEE, but that would be a shame because I’d need to introduce some code instrumentation/annotation to achieve what I want. Another option would be to use an AST visitor to enumerate the paths myself, but it would be nice to leverage LLVM’s existing functionality for this.

Thanks,
Matthew

From: Gábor Horváth <xazax at google.com<mailto:xazax at google.com>>
Sent: Wednesday, January 15, 2020 08:13
To: Fernandez, Matthew <matthew.fernandez at intel.com<mailto:matthew.fernandez at intel.com>>
Cc: cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] exhaustiveness of CSA checkers

Hi!

The clang static analyzer does not give you any guarantees regarding the coverage/exhaustiveness. There is no way to ensure exhaustive analysis (such analysis is likely to be unbounded for most non-trivial programs, so this is not only about runtime, but also termination). For this reason all the checks have to be implemented with non-exhaustiveness in mind.

Could you share what you are trying to achieve? Maybe symbolic execution is not the right tool for that problem.

Cheers,
Gabor

On Wed, Jan 15, 2020 at 12:58 AM Fernandez, Matthew via cfe-dev <cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>> wrote:
Hello cfe-dev,

In prototyping a custom checker for the Clang Static Analyzer, I’ve found analysis terminates at some complexity limit. That is, when your target function exceeds some complexity bound, CSA stops path traversal and your checker does not receive callbacks for any remaining unvisited nodes. The two specific scenarios where I’ve run into this are high-iteration-count loops and complex conditionals (multiple short circuiting && and || operators). The first I can work around by rephrasing the target loops or something like -analyzer-max-loop, but I can’t find a way to affect the behavior of the second. To compound the situation, I cannot see how the checker can detect that path exploration was incomplete.

Is there a way to control the complexity limit enforced for conditionals? Or, failing that, to detect within the checker when path exploration was incomplete?

To give some more context, my checker is an experiment and not something I am intending to upstream. Runtime is not an issue; I am fine with the analyzer taking multiple hours for a single run. Though I understand why the existing CSA bound choices have been made, as most users do not want their compiler to run for this long.

Please CC me in replies as I’m not subscribed.

Thanks,
Matthew
_______________________________________________
cfe-dev mailing list
cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200115/323e1a63/attachment-0001.html>