[llvm-dev] [cfe-dev] Soundness in clang SA

Siddharth Shankar Swain via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 24 05:31:34 PDT 2018


Thanks for the detailed explanation. So is changing the analysis technique
from path sensitive (symbolic execution) analysis to some other technique
bring in the soundness property ?
Thanks,
Siddharth

On Sat, Aug 18, 2018 at 1:27 AM, Artem Dergachev <noqnoqneo at gmail.com>
wrote:

> Nope, at least not with the "path-sensitive" (symbolic execution) engine.
> Which is the whole point of using the Analyzer, as compared to writing an
> arbitrary tool by exploring the AST yourself. Depends on what you want to
> check, of course, something really trivial should be possible.
>
> While we do care about false positives a lot, there are some decisions
> within the Analyzer's core that prevent perfect soundness from happening
> but are still considered to be good because they greatly boost our ability
> to find any bugs at all and at the same time simplify development
> dramatically.
>
> One of the most common unsound assumptions that the Analyzer makes is that
> in a program that looks like this: "if (x) { ... } if (y) { ... }", where
> 'x' and 'y' are completely unrelated to each other, the Analyzer would
> explore all four paths (i.e., (x is true or false) times (y is true or
> false)) as if it's sure that they are feasible. This is not only unsound on
> its own, but it amplifies other inaccuracies in modeling (eg., if function
> body is unavailable for analysis, destroying information when such function
> is called would lead not only to false negatives but to false positives as
> well). But these false positives are relatively easy to suppress (eg., by
> adding some sort of assert(x || !y), which is in most cases anyway a good
> thing to document), and the primary benefit of this simplification is that
> checkers no longer need to implement a "merge" function for the information
> that they track in the program state: they only need to update their state
> by modeling effect of every non-control-flow event in the program. With
> ~100 checkers already written, limiting them to only simple operations on
> the program state helps dramatically.
>
> Additionally, the Analyzer's understanding of the language is not perfect.
> We're constantly working on this, but languages like C++ are *huge*. We
> have a separation of responsibilities between core and checkers that allow
> us to write most transfer functions once within the core and have checkers
> simply consume the results of the modeling (checkers may still want to
> model their own metadata), but covering all AST nodes and their
> interactions is still, well, harder than writing a CodeGen. Any bugs in
> core transfer functions will prevent you from doing a sound analysis, and
> making sure all such bugs are fixed is probably much more work than
> whatever you plan to accomplish here.
>
> On the non-technical side, it is always up to a developer of the program
> to decide what is a genuine bug. For example, there are a lot of projects
> on which a memory leak is not considered to be a bug. Or if an application
> is not security-critical, a null pointer dereference we find may be on such
> an unlikely path that the developer will only be annoyed to know about it
> and never really appreciate the report. And the Analyzer doesn't even
> guarantee that it'd find anything that's more severe than a dead code:
> after all, the only reason symbolic execution works is "state splits are
> justified because otherwise it's dead code".
>
> Last but not least, there's also the "Clang CFG" thing, which the Analyzer
> uses (and some compiler warnings also use), but it's not used for
> compilation (LLVM CFG is used instead) and it's not entirely accurate
> (especially for C++, though it's getting better). Clang CFG is usable for
> creating usable data flow analyses which you can make as sound as the CFG
> is accurate. There's a collection of such analyses in lib/Analysis. The
> Analyzer uses some of them internally and we're very happy with them most
> of the time. At the same time, there's no fancy framework for creating
> custom data flow analyses, like there is for the Analyzer's checkers;
> you'll have to write all transfer functions yourself. So if all you want is
> a ready-made CFG for C, you should have a look at that.
>
>
> On 8/17/18 4:35 AM, Siddharth Shankar Swain via cfe-dev wrote:
>
> Hi all,
>
> Is it possible to develop a checker  or some feature in clang SA which
> will only have perfect soundness property ( if we don't care about
> completness property )  i.e if the analyzer says X is a genuine bug then X
> is really a genuine bug. Whatever  bug  it reports are all genuine but it
> doesn't report all genuine bugs. Please guide.
>
> Thanks,
> Siddharth
>
> _______________________________________________
> cfe-dev mailing listcfe-dev at lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180824/aa83f355/attachment.html>


More information about the llvm-dev mailing list