[cfe-dev] [analyzer] Bug post-processing questions

Mon Mar 5 14:55:28 PST 2018

On 05/03/2018 2:30 PM, George Karpenkov via cfe-dev wrote:
> Hi Réka,
>
>> I was thinking about whether the whole ExplodedGraph needs to be 
>> re-built if we want to post-process bugs occured in its realm.
>
> Generally all of post-processing is done in the 
> `BugReporterVisitors.cpp` file,
> with general entry point being `trackNullOrUndefValue` function for 
> many checkers.
>
> The bug reporter visitor gets access to the last node in the exploded 
> graph, but can generally very easily retrace the past,
> by going up the chain of predecessors.
>
>> After looking at the EGs of some code snippets analyzed using the 
>> default solver and then Z3, it seems to me that constraints, but also 
>> environment and store contents are changing, so one guess would be a yes.
>
> In general, I think it should be possible to get a good increase in 
> precision just by re-analyzing the existing exploded graph.
> From my understanding, many constraints are different because the 
> constraint manager tries to be as efficient as possible,
> and destroys all constraints which can not be processed by the current 
> solver.

I suspect that simply not destroying constraints on symbol that we 
cannot handle would already be a slight increase in precision. Because 
at least we can take the same path when the same symbol we cannot handle 
is encountered twice. Eg.:

if (x^3 + y^3 == z^3) {
   // assume any branch, mark symbol (($x)^3 + ($y)^3 == ($z)^3)
   // as [ 0, 0 ] or [-INT_MAX, -1] U [1, INT_MAX] respectively
}

if (x^3 + y^3 == z^3) {
   // take the same branch that has been taken
   // on the previous if.
}

Which leads me into believing that regardless of the solver, keeping 
those around is a good thing. I guess they are only removed for 
performance reasons, not because keeping them around was causing false 
positives. And i'm in favor of limiting symbol complexity to avoid 
performance problems (this way we'd handle all obvious cases, but if the 
arithmetic is ridiculously complex - the user would be more likely to 
forgive us).

> If a further increase in precision is needed, it should not be very 
> difficult to change that destroying logic to keep those constraints as 
> well.
>
>> But again, completely re-building EGs might not be far in slow-down 
>> from analyzing with Z3 from the beginning, depending on what portion 
>> of the code is buggy.
>
> Yes, that’s why post-processing is done.
>
>> So one could work with the information already present in the graph 
>> (this might need telling the analyzer core not to throw away stuff it 
>> does not understand when the false positive refutation option is 
>> turned on, if such thing happens?), and re-evaluate those, perhaps 
>> around the time we call FindReportInEquivalenceClass().
>
> As mentioned before, it’s better to do those in your own visitor 
> inside BugReporterVisitors.cpp.
> Visitors get the `BugReport` object, and can call 
> `BugReport::markInvalid` if invalidation is needed.
>
> Regards,
> George
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev