[cfe-dev] [analyzer][GSoC] Problem Statement: Improving BugReporter with static backward program slicing

Thu Apr 4 14:39:54 PDT 2019

This. Is. Amazing. Thank you so much for taking the time to assemble this
mail! I agree with everything stated above -- I think I'm as ready as I'll
ever be to put together a nice looking proposal.

On Thu, 4 Apr 2019 at 22:22, Artem Dergachev <noqnoqneo at gmail.com> wrote:

> I kinda gathered some understanding of the problem and the solution and
> trying to come up with some sort of a plan, and for that purpose i'm trying
> to step back from "we should totally do this cool technique" towards a more
> "adult", "boring" point of view that consists in asking, "what *do* we need
> to do?".
>
> For now i'm seeing that the work on this problem can be decomposed into
> three chunks:
>
> (1) Collect slicing criteria from various sources,
> (2) Perform slicing to discover data and control dependencies,
> (3) Use information about dependencies to improve the bug report.
>
> Chunk (3) sounds straightforward: we just enable or disable path pieces
> based on information we already have. I don't fully understand how it'll
> go, but i don't see any unknowns here.
>
> Chunk (1) sounds like a large amount of routine, repetitive work: convert
> sources of slicing criteria, such as checkers, to the new system
> one-by-one. There is certain amount of known unknowns here: some sources
> may be tricky to take advantage of (eg., liveness-related sources), some
> may put stress on the checkers.
>
> Chunk (2) is the research-intensive part: it has an easy 80% solution (AST
> matching based detection that has already been implemented and proved
> useful for one of the checkers) and an idea to improve upon it by
> implementing a more sophisticated analysis.
>
> Because chunk (2) is research-intensive, we need experimental data that we
> currently don't have. In order to avoid spending a lot of time on something
> that isn't going to work, we should try to gather that data as soon as
> possible in our plan.
>

Umm, what do you mean under "data"? Like, proving how the examples I showed
would be solved by this technique?

> Because chunk (1) is repetitive, we don't need to implement it completely;
> we can pick one or two sources and get them right.
>
> Therefore here's the plan that emerges:
> i. Pick one or two slicing criteria sources and implement them.
> ii. Implement a simple AST-based algorithm for discovering dependencies.
> iii. Apply the information produced by that algorithm to the bug report.
> iv. Run the updated analyzer on real-world codebases and see what problems
> remain.
> v. Implement solutions to these problems.
>
> Solutions on step v. may or may not be about coding a more sophisticated
> slicing algorithm. They might as well be improvements in chunk (1), i.e.
> discovering more slicing criteria, or they may be about a "horizontal"
> improvement in the existing primitive algorithm (eg., cover more forms of
> control flow or different variants of data dependencies, which should have
> been a separate chunk).
>
> That's roughly how i imagine it.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190404/08c9d6e2/attachment.html>