[PATCH] D53076: [analyzer] WIP: Enhance ConditionBRVisitor to write out more information

Wed Oct 10 14:51:50 PDT 2018

NoQ added a comment.

In https://reviews.llvm.org/D53076#1260663, @george.karpenkov wrote:

> 1. Note that "Assuming X" directives are useful for the analyzer developers, since they know that's where the checker makes arbitrary assumptions, but to end users that mostly feels like noise ("Taking true branch" is there already, why there should be "Assuming 'i' is > 0" as well?)

I believe that distinction between "Assuming..." (event piece, yellow bar in scan-build) and "Taking..." (control-flow piece, grey bar in scan-build, depicted with arrows //without text// in a number of other UIs) is useful for the users as well, not just for hackers. I agree that the distinction isn't obvious and for now this part of the UI is not very useful. But now that you're at it, i think it'd be better to fix this problem by making diagnostics more understandable rather than by simplifying out the distinction.

For example, in the `inline-plist.c`'s `bar()` on line 45, Static Analyzer indeed doesn't assume that `p` is equal to null; instead, Static Analyzer *knows* it for sure. There's no guess made here, and that's not an assumption that the user would need to agree or disagree with while looking at the report. Instead, it is an inevitable consequence of the previous events that occurred on this path. So i guess we could say something like "Knowing that 'p' is equal to null" or "'p' is inevitably null" and it should make the distinction apparent to the user. The user would notice that there's a change in the way we're talking about the fact.

The other reason why it's important is that those arbitrary assumptions are one of the fundamental weakness of the technique behind Static Analyzer: the user is indeed allowed to disagree with these assumptions and then mark the positive as false and suppress it with an assertion. In a code with a single branch such approach works fine because it is based on "presumption of no deadcode" (i.e., if there's an `if` in the code, both branches should be reached sometimes), but when there are N consequent branches, it is not necessary for all 2^N possible execution paths to be feasible: an O(N) number of paths can cover all the branches. But when it comes to actual facts that are inevitably true after the user has agreed with all previous assumptions on the path, the user can't disagree with those facts anymore, and that's an important piece of info to convey.

In https://reviews.llvm.org/D53076#1260663, @george.karpenkov wrote:

> 2. @NoQ do you know why the GDM comparison was there in the first place? The commit was made by Ted in 2011, maybe constraint changes had to be reflected in the GDM at that point (?)

It's likely that back then GDM only contained constraints and checker info (and program point kind guaranteed that checker info could not have changed). But that's not the case anymore (we have more core traits - dynamic type info, C++ constructor support, taint, etc.), so this code is definitely incorrect; not sure how often does it matter. In order to produce an actual correct logic, we probably need to add a method to `ConstraintManager` to ask whether constraints have changed between two states.

https://reviews.llvm.org/D53076