[cfe-dev] [analyzer][GSoC] Problem Statement: Improving BugReporter with static backward program slicing

Kristóf Umann via cfe-dev cfe-dev at lists.llvm.org
Thu Apr 4 14:52:15 PDT 2019


Alright, cheers! I'll take my time to put something really nice together by
tomorrow.

On Thu, 4 Apr 2019 at 23:48, Artem Dergachev <noqnoqneo at gmail.com> wrote:

> > Umm, what do you mean under "data"? Like, proving how the examples I
> showed would be solved by this technique?
>
> I roughly refer-ahead-to the output of step iv. - it should do the trick.
> See if we've already solved the problem at step iii. If not, take a few
> real-world reports that are still bad, try to understand what's wrong with
> them. Eg., are we not stuffing enough slicing criteria into them? Or is a
> more sophisticated slicing algorithm necessary to discover some of the
> dependencies? Or are we instead discovering too many dependencies? Or is it
> a completely unrelated bug that's suddenly causing havoc? Rinse and repeat
> until most of our reports look great :)
>
> On 4/4/19 2:39 PM, Kristóf Umann wrote:
>
> This. Is. Amazing. Thank you so much for taking the time to assemble this
> mail! I agree with everything stated above -- I think I'm as ready as I'll
> ever be to put together a nice looking proposal.
>
> On Thu, 4 Apr 2019 at 22:22, Artem Dergachev <noqnoqneo at gmail.com> wrote:
>
>> I kinda gathered some understanding of the problem and the solution and
>> trying to come up with some sort of a plan, and for that purpose i'm trying
>> to step back from "we should totally do this cool technique" towards a more
>> "adult", "boring" point of view that consists in asking, "what *do* we need
>> to do?".
>>
>> For now i'm seeing that the work on this problem can be decomposed into
>> three chunks:
>>
>> (1) Collect slicing criteria from various sources,
>> (2) Perform slicing to discover data and control dependencies,
>> (3) Use information about dependencies to improve the bug report.
>>
>> Chunk (3) sounds straightforward: we just enable or disable path pieces
>> based on information we already have. I don't fully understand how it'll
>> go, but i don't see any unknowns here.
>>
>> Chunk (1) sounds like a large amount of routine, repetitive work: convert
>> sources of slicing criteria, such as checkers, to the new system
>> one-by-one. There is certain amount of known unknowns here: some sources
>> may be tricky to take advantage of (eg., liveness-related sources), some
>> may put stress on the checkers.
>>
>> Chunk (2) is the research-intensive part: it has an easy 80% solution
>> (AST matching based detection that has already been implemented and proved
>> useful for one of the checkers) and an idea to improve upon it by
>> implementing a more sophisticated analysis.
>>
>> Because chunk (2) is research-intensive, we need experimental data that
>> we currently don't have. In order to avoid spending a lot of time on
>> something that isn't going to work, we should try to gather that data as
>> soon as possible in our plan.
>>
>
> Umm, what do you mean under "data"? Like, proving how the examples I
> showed would be solved by this technique?
>
>
>> Because chunk (1) is repetitive, we don't need to implement it
>> completely; we can pick one or two sources and get them right.
>>
>> Therefore here's the plan that emerges:
>> i. Pick one or two slicing criteria sources and implement them.
>> ii. Implement a simple AST-based algorithm for discovering dependencies.
>> iii. Apply the information produced by that algorithm to the bug report.
>> iv. Run the updated analyzer on real-world codebases and see what
>> problems remain.
>> v. Implement solutions to these problems.
>>
>> Solutions on step v. may or may not be about coding a more sophisticated
>> slicing algorithm. They might as well be improvements in chunk (1), i.e.
>> discovering more slicing criteria, or they may be about a "horizontal"
>> improvement in the existing primitive algorithm (eg., cover more forms of
>> control flow or different variants of data dependencies, which should have
>> been a separate chunk).
>>
>> That's roughly how i imagine it.
>>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190404/d3ad90a0/attachment.html>


More information about the cfe-dev mailing list