[llvm-dev] [cfe-dev] Soundness in clang SA
Artem Dergachev via llvm-dev
llvm-dev at lists.llvm.org
Fri Aug 24 16:12:49 PDT 2018
On 8/24/18 5:31 AM, Siddharth Shankar Swain wrote:
> Thanks for the detailed explanation. So is changing the analysis
> technique from path sensitive (symbolic execution) analysis to some
> other technique bring in the soundness property ?
Yup, depending on the technique.
And that wouldn't be Analyzer-specific anymore.
> Thanks,
> Siddharth
>
> On Sat, Aug 18, 2018 at 1:27 AM, Artem Dergachev <noqnoqneo at gmail.com
> <mailto:noqnoqneo at gmail.com>> wrote:
>
> Nope, at least not with the "path-sensitive" (symbolic execution)
> engine. Which is the whole point of using the Analyzer, as
> compared to writing an arbitrary tool by exploring the AST
> yourself. Depends on what you want to check, of course, something
> really trivial should be possible.
>
> While we do care about false positives a lot, there are some
> decisions within the Analyzer's core that prevent perfect
> soundness from happening but are still considered to be good
> because they greatly boost our ability to find any bugs at all and
> at the same time simplify development dramatically.
>
> One of the most common unsound assumptions that the Analyzer makes
> is that in a program that looks like this: "if (x) { ... } if (y)
> { ... }", where 'x' and 'y' are completely unrelated to each
> other, the Analyzer would explore all four paths (i.e., (x is true
> or false) times (y is true or false)) as if it's sure that they
> are feasible. This is not only unsound on its own, but it
> amplifies other inaccuracies in modeling (eg., if function body is
> unavailable for analysis, destroying information when such
> function is called would lead not only to false negatives but to
> false positives as well). But these false positives are relatively
> easy to suppress (eg., by adding some sort of assert(x || !y),
> which is in most cases anyway a good thing to document), and the
> primary benefit of this simplification is that checkers no longer
> need to implement a "merge" function for the information that they
> track in the program state: they only need to update their state
> by modeling effect of every non-control-flow event in the program.
> With ~100 checkers already written, limiting them to only simple
> operations on the program state helps dramatically.
>
> Additionally, the Analyzer's understanding of the language is not
> perfect. We're constantly working on this, but languages like C++
> are *huge*. We have a separation of responsibilities between core
> and checkers that allow us to write most transfer functions once
> within the core and have checkers simply consume the results of
> the modeling (checkers may still want to model their own
> metadata), but covering all AST nodes and their interactions is
> still, well, harder than writing a CodeGen. Any bugs in core
> transfer functions will prevent you from doing a sound analysis,
> and making sure all such bugs are fixed is probably much more work
> than whatever you plan to accomplish here.
>
> On the non-technical side, it is always up to a developer of the
> program to decide what is a genuine bug. For example, there are a
> lot of projects on which a memory leak is not considered to be a
> bug. Or if an application is not security-critical, a null pointer
> dereference we find may be on such an unlikely path that the
> developer will only be annoyed to know about it and never really
> appreciate the report. And the Analyzer doesn't even guarantee
> that it'd find anything that's more severe than a dead code: after
> all, the only reason symbolic execution works is "state splits are
> justified because otherwise it's dead code".
>
> Last but not least, there's also the "Clang CFG" thing, which the
> Analyzer uses (and some compiler warnings also use), but it's not
> used for compilation (LLVM CFG is used instead) and it's not
> entirely accurate (especially for C++, though it's getting
> better). Clang CFG is usable for creating usable data flow
> analyses which you can make as sound as the CFG is accurate.
> There's a collection of such analyses in lib/Analysis. The
> Analyzer uses some of them internally and we're very happy with
> them most of the time. At the same time, there's no fancy
> framework for creating custom data flow analyses, like there is
> for the Analyzer's checkers; you'll have to write all transfer
> functions yourself. So if all you want is a ready-made CFG for C,
> you should have a look at that.
>
>
> On 8/17/18 4:35 AM, Siddharth Shankar Swain via cfe-dev wrote:
>> Hi all,
>>
>> Is it possible to develop a checker or some feature in clang SA
>> which will only have perfect soundness property ( if we don't
>> care about completness property ) i.e if the analyzer says X is
>> a genuine bug then X is really a genuine bug. Whatever bug it
>> reports are all genuine but it doesn't report all genuine bugs.
>> Please guide.
>>
>> Thanks,
>> Siddharth
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180824/f7388068/attachment.html>
More information about the llvm-dev
mailing list