[cfe-dev] [analyzer] Rough sketch of the algorithm for "Enhancing bug reporting with static backward program slicing"
Artem Dergachev via cfe-dev
cfe-dev at lists.llvm.org
Wed May 22 17:29:48 PDT 2019
My primary question is, how do you plan to use the data that you obtain
via your analysis?
Like, do you plan to backtrack the value of all relevant variables
and/or expressions in the final bug report that were also encountered
along the bug path? If yes, then is it possible that the slicing
criterion gets updated in the process and you'll have to re-run the
CFG-based analysis to take it into account?
> What would also be really great is to assist this traversal with the
information the analyzer already has, essentially only inspecting basic
blocks the analyzer actually visits.
You mean visits on the current bug path or visits on the equivalence
class of bugs or visits in general during analysis?
Regardless, for many problems ideally we should traverse basic blocks
that the analyzer *didn't* visit (eg., if the function didn't initialize
the variable on the current path, we're interested in parts of the code
in which it *did* initialize the variable, even though we didn't
necessarily have time to visit them).
It actually sounds as if all problems that we're trying to solve here
can be classified into "must-have-happened" problems (eg., "variable
must-have been declared without initialization", "variable must-have
been set to nullptr", "memory must-have been allocated") and
"should-not-have-happened" problems (eg., "variable should-not-have been
initialized", "null value should-not-have been overwritten", "pointer
should-not-have escaped").
For must-have-happened problems, i'm recently under an impression that
we should suppress the bug report entirely when we fail to solve them
(i.e., if fail to explain to the user where exactly does this happen,
then the report is impossible to understand anyway). This is currently a
very big problem for our null dereference checker on some codebases,
especially because it uses this tracking for suppressions as well (aka
inlined defensive checks), so when it fails to track the variable it's
likely a legit false positive as well, not simply a hard-to-understand
report.
For should-not-have-happened problems i'm much more confused. We're
talking about looking for places where it *could* have happened and then
trying to explain to the user why none of them have actually happened.
I'm not sure what are the consequences of failing to explain to the user
why didn't a particular piece of code do something, because i've no idea
how do users intuitively figure out which code *could* have done these
things and which clearly couldn't.
On 5/22/19 4:11 PM, Kristóf Umann wrote:
>
> Hi!
>
>
> I'd like to share some thoughts about my GSoC project, "Enhancing bug
> reporting with static backward program slicing"[1].
>
>
> My proposal is very specific about whatI'm aiming to improve on, but
> vague on the howpart of it. This mail aims to add clarify some of this.
>
>
> Program slicing is essentially a technique of discovering data and
> control dependencies to the slicing criterion, which is a (statement,
> {set of variables}) pair. Fortunately, tools for control dependencies,
> namely, dominator set calculations are already implemented, but seems
> to be unstable with clang's CFG. It would be a great tool if I were
> able to fix it.
>
>
> While my proposal states that I plan to implement an AST-based
> solution, I'm actually not sure that this is the optimal approach. We
> could instead inspect CFG blocks in a backwards manner (coupled with
> the analyzer's call stack), and gather relevant variable in the meanwhile.
>
>
> What would also be really great is to assist this traversal with the
> information the analyzer already has, essentially only inspecting
> basic blocks the analyzer actually visits.
>
>
> Here’s my idea for an algorithm (from the point of the slicing
> criterion already being constructed):
>
> https://docs.google.com/document/d/1Lx867o3meyQsj0WKOSWMdosSBdw2MUq1aIxyPJM6iLU/edit?usp=sharing
>
>
>
> Please note that this is a veeery rough sketch, I didn't think about
> every edge case that exists, but I think it's an okay start.
>
>
> [1]
> https://docs.google.com/document/d/1ci1BCAKojPlqIxIw1J_K2dnATA3z01PuwR_vHJS55TI/edit
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190522/dee16939/attachment.html>
More information about the cfe-dev
mailing list