[cfe-dev] [analyzer] Tracking values for generating bug reports

Mon Jun 24 16:48:35 PDT 2019

I went ahead and (although admittedly with a very crude implementation) I
enhanced condition tracking with the following:

* All notes about their initialization are discarded
* No longer tracking the right hand side of the last store
* Last stores are discarded unless it happened in another stack frame

I evaluated the prototype solution on LLVM+Clang+Clang-tools-extra, Xerces
and Bitcoin. I might add rtags and YouCompleteMe later, but I think this is
sufficient for now:

http://cc.elte.hu:15001/Default/#

Due to the lack of better categorization, I used the following (so this has
in fact no relation to whether the report is correct or not):
* Intentional reports (green x) are the ones where the bugreport got
unquestionably worse.
* Confirmed reports (red check mark) got definitely better
* False positives (grey line) are in between.

I didn't have time to look at them all, but here are some reports I found
interesting on LLVM:

http://cc.elte.hu:15001/Default/#is-unique=on&run=LLVM%2FClang%2FClang-tools-extra%20AFTER%20tracking%20conditions&review-status=Confirmed&review-status=False%20positive&review-status=Intentional&detection-status=New&detection-status=Reopened&detection-status=Unresolved&tab=LLVM%2FClang%2FClang-tools-extra%20AFTER%20tracking%20conditions

By clicking on a bug report, and changing the run with the dropdown menu
found in the upper right corner (next to "Also found in:"), you can see
that in some cases these enhancements kept the amount of extra notes at bay
quite well.

.* BEFORE condition tracking: Analysis made on LLVM monorepo commit
4cc6d72bb4d
.* AFTER condition tracking: With https://reviews.llvm.org/D62883 applied
and condition tracking enabled.
.* AFTER condition tracking + DEBUG notes: With
https://reviews.llvm.org/D63642 applied and condition tracking with debug
notes enabled.
.* AFTER condition tracking WITH moderate tracking: With the above
mentioned additions applied on top of https://reviews.llvm.org/D62883 and
condition tracking enabled.
.* AFTER condition tracking WITH moderate tracking + DEBUG notes: With the
above mentioned additions applied on top of https://reviews.llvm.org/D63642 and
condition tracking with debug notes enabled.

On Wed, 19 Jun 2019 at 02:08, Artem Dergachev <adergachev at apple.com> wrote:

>
>
> On Jun 18, 2019, at 5:02 PM, Kristóf Umann <dkszelethus at gmail.com> wrote:
>
>
>
> On Wed, 19 Jun 2019 at 01:14, Artem Dergachev <noqnoqneo at gmail.com> wrote:
>
>>
>>
>> On 6/18/19 3:44 PM, Gábor Horváth wrote:
>>
>> Hi Kristóf,
>>
>> Thanks for the report. I have been thinking about what kind of user
>> experience should we pursue. While my ideas might be a bit opinionated, see
>> my comments inline.
>>
>> On Tue, 18 Jun 2019 at 20:47, Kristóf Umann <dkszelethus at gmail.com>
>> wrote:
>>
>>> Hi!
>>>
>>> This is an update on my GSoC project, "Enhancing bug reports in the
>>> Clang Static Analyzer". We had some discussions on phabricator, and some in
>>> private meetings, and I think it's due to share how things are looking
>>> right now.
>>>
>>> In my previous mail [1], I detailed the two distinct categories we
>>> defined, the "must-have-happened" case where every information is already
>>> in the bug path, and the "should-not-have-happened" case when the
>>> information isn't found in the bugpath. I divided the latter into two
>>> subcategories, the "Inlined category" (initially referred to as "category
>>> 1"), and the "Not inlined category" (initally referred to as "category 2").
>>>
>>> These categorizations are used to teach the analyzer incrementally about
>>> which nodes in the bugpath deserve special attention. For now, I plan to
>>> use this information exclusively for finding control dependencies to these
>>> nodes, and tracking their condition. Now, what we mean under "tracking an
>>> expression value", is that to add notes to the bug report relevant to that
>>> expression value.
>>>
>>> Ultimately, this means that my project depends greatly on condition
>>> tracking yielding meaningful addition of notes to the bug report, without
>>> adding unimportant ones. Since I more-or-less finished my work on the
>>> must-have-happened case (meaning that the analyzer can now figure out
>>> control dependencies to nodes contained in the bugpath), I'd like to detail
>>> how I plan to work on this.
>>>
>>> While evaluating an early prototype solution to the "must-have-happened"
>>> case where the same expression value tracking was used for both the
>>> bug-causing variable and for the conditions, I found that in many cases,
>>> the growth of bug length was intolerable. This is, in part, caused by
>>> conditions being tracked to a condition recursively, the conditions of
>>> asserts being tracked, and that notes about a condition are not as
>>> interesting as notes about the bug causing variable (calls to operator bool
>>> for instance).
>>>
>>> Fixing any of these requires me to teach the analyzer the difference in
>>> between "THE value" and "just a condition". The details are a little more
>>> complicated, so I'll show some examples that point out certain cases:
>>>
>>> *Example 1.:*
>>>
>>> 01 int flag;
>>> 02 bool coin();
>>> 03
>>> 04 void foo() {
>>> 05   flag = coin();
>>> 06 }
>>> 07
>>> 08 int main() {
>>> 09   int *x = 0;
>>> 10   foo();
>>> 11   if (flag) // assumed true
>>> 12     *x = 5; // warn
>>> 13 }
>>>
>>> In this example, it'd be great to see notes placed on line 10 and 5,
>>> because if flag wasn't invalidated, the bug would not have occurred (since
>>> flag is global, and is initialized to 0). The analyzer normally wouldn't
>>> place notes there, so we definitely should track flag up to line 5.
>>>
>>> *Example 2.:*
>>>
>>> 01 int flag;
>>> 02 bool coin();
>>> 03
>>> 04 void foo() {
>>> 05   coin();
>>> 06 }
>>> 07
>>> 08 int main() {
>>> 09   int *x = 0;
>>> 10   foo();
>>> 11   if (flag) // assumed true
>>> 12     *x = 5; // warn
>>> 13 }
>>>
>>> This case is very similar, with the only difference being that the
>>> analyzer conservatively assumed that coin may have written flag (as it's a
>>> global variable). We should track until line 5.
>>>
>>
>> I am not sure about this example. While the invalidation of the global
>> flag is definitely playing a role in this bug report the point where the
>> flag is actually invalidated could seem quite arbitrary for the user. We
>> might get the invalidation when we first see a function that is not defined
>> in this TU, when we reach the maximum call stack depth and so on. My point
>> is, I am not sure if we actually help the user by pinpointing the first
>> function call where the invalidation happens. A more user friendly way to
>> think about this problem is how could the user solve this false positive?
>> As far as I understand the cannonical solution would be to add an
>> assertion, so it would be better to put a node close to the place where the
>> user would change the code, so I would put it in the stack frame of the
>> bug. For example we could generate a note for the call foo(), that
>> somewhere in that call stack the analyzer could no longer track the value
>> of flag, thus invalidated its contents. @Artem Dergachev
>> <adergachev at apple.com> what do you think?
>>
>>
>> This is definitely one of the most annoying kinds of false positives we
>> have due to infeasible paths and over-eager assumption chains. Like, you
>> can add an assertion here, but it's going to look super ugly:
>>
>>   bool tmp_flag = flag;
>>   coin(); // why would anybody believe it touches flag in the first
>> place???
>>   assert(flag == tmp_flag && "sanity check");
>>
>> And even if the user would actually like to add something like that to
>> his code, it would require a powerful constraint solver to even handle this
>> assertion correctly.
>>
>> These are not very common, but they're so annoying that i probably
>> wouldn't be against marking reports as invalid immediately when we see that
>> an important control dependency relies on such invalidation (we'll have to
>> gather experimental data on that, of course).
>>
>>
> Hmm, yea, good point. But if I coin were to take flag as a non-const
> reference (other than the fact that flag is global) I guess we would like
> to keep the report and place notes all the way there? I agree with Artem,
> we really need to see the real world impact of this.
>
>
> Yup.
>
>
>
>>
>>>
>>> *Example 3.:*
>>>
>>> 01 void f(int flag) {
>>> 02   int *x = 0;
>>> 03   if (flag) // assumed true
>>> 04     *x = 5; // warn
>>> 05 }
>>>
>>> Here, the user could simply follow the arrows that shows the path of
>>> execution: it isn't really important what flag was initialized on the first
>>> line.
>>>
>>> *Example 4.:*
>>>
>>> 01 int flag;
>>> 02
>>> 03 int getInt();
>>> 04
>>> 05 int main() {
>>> 06   int *x = 0;
>>> 07   int y = getInt();
>>> 08   flag = y;
>>> 09   if (flag) // assumed true
>>> 10     *x = 5; // warn
>>> 11 }
>>>
>>> Again, the user could see there was a write made to flag on line 8 --
>>> the question is, whether we should explain y better. Right now, we're
>>> thinking that we shouldn't, leading to the conclusion that flag here
>>> shouldn't be tracked at all.
>>>
>>
>> Finding the place where flag was modified in a really big function can
>> still be challenging. While generating a note would be an overkill I wonder
>> if some middle ground like highlighting the statement without additional
>> text would actually make sense for this case.
>>
>> I think its a matter of a simple comparison of the stack frames that
> would decide whether we only want to track the condition if it was in
> between the collapse point and the initialization in a nested stack frame,
> or regardless of the stack frame. Let's see just see which yields the
> better result!
>
>>
>> When it's in the same function, you can simply search for the variable
>> name to see all writes into it. It's not that hard. Even if the variable is
>> written into by pointer, you can see that its address is taken and search
>> for the pointer variable as well (though annoying if there are too many
>> levels of indirection).
>>
>> But, hmm, if the address is taken in a different stack frame, this
>> becomes much harder. I guess, writes that are made through pointers should
>> be highlighted, and we should also do our tracking to explain why do we
>> think that this pointer points to that variable.
>>
>>
> Ah yes, our arch enemy, pointer analysis showed up again. I guess I agree
> with you, but I plan on approaching aliasing related issues when I got most
> of the other stuff working.
>
>
> In this case it's much more trivial because you're simply relying on the
> information about pointers that's already contained in the graph.
>
>
>>
>>
>>>
>>> *Example 5.:*
>>>
>>> 01 int flag;
>>> 02
>>> 03 int getInt();
>>> 04
>>> 05 void foo() {
>>> 06   int y = getInt();
>>> 07   flag = y;
>>> 08 }
>>> 09
>>> 10 int main() {
>>> 11   int *x = 0;
>>> 12 foo();
>>> 13   if (flag) // assumed true
>>> 14     *x = 5; // warn
>>> 15 }
>>>
>>> Like Example 1-2, we should explain that flag was written on line 7, but
>>> like in Example 4., we shouldn't track y.
>>>
>>
>> Again, from the user's point of view, I think the ultimate solution would
>> be to have an interface, where we does not present the part where y was
>> tracked by default, but the user could click on a button to actually expand
>> that part of the path in case she consider it interesting. This, of course,
>> is not part of you GSoC, I am just wondering what is the ideal that we
>> would like to pursue in the long run.
>>
>>
>> A more powerful UI could have indeed solved a lot of these problems by
>> telling users to do our work. But given that it comes with a fairly large
>> cost of developing such UIs for every tool that integrates the Static
>> Analyzer, i'd definitely try to push our own effort of stuffing exactly as
>> much information as necessary into the report as far as possible.
>>
>>
> I actually had a number of discussions with Dániel Krupp about this, and
> folks in Ericsson seem to be very interested in this direction. This could
> be a great compliment to my GSoC as a followup project.
>
>>
>>
>>>
>>> *Example 6.:*
>>> 01 void f(int flag) {
>>> 02   int *x = 0;
>>> 03 assert(flag);
>>> 04   if (flag) // assumed true
>>> 05     *x = 5; // warn
>>> 06 }
>>>
>>> Currently, we mention nothing about line 3, yet we say "Taking the true
>>> branch" on line 4, rather then "Assuming the condition is true". This is
>>> because the collapse point (point at which we constrain flag's value to be
>>> true or false) isn't on line 4: the analysis only continued past the assert
>>> since the analyzer assumed flag to be non-zero. In this case, we would like
>>> to track flag to the assert to explain why we are so confident on line 5
>>> that flag is non-zero.
>>>
>>
>> What do you mean by tracking tha flag back to the assert? The reason why
>> a note is useful because in this case either the assert is wrong or the bug
>> is a true positive. But again, when the assert is in the same function we
>> might not want to generate additional note for this as this info might
>> easily be inferred by following the arrows in the report (but we might
>> highlight the assertion line, see my previous comments). In case the assert
>> is in a separate (inlined) function it would make much more sense to
>> generate a note.
>>
>>
>> Wouldn't the assert already contain a (prunable) note saying "Assuming
>> 'flag' is true"? I guess the only problem here is that the note is
>> prunable, so it won't be shown if the assert is wrapped into a nested
>> function call.
>>
>> Yup, with -analyzer-config prune-paths=false, you get the following
> report, but it's still not explained well why flag is non-zero (the thing
> that I'm missing is the connection in between b and flag):
>
> $ cat example_6.cpp
>
> [[noreturn]] void halt();
>
> void assert(int b) {
>   if (!b)
>     halt();
> }
>
> void f(int flag) {
>   int *x = 0;
>   assert(flag);
>   if (flag) // assumed true
>     *x = 5; // warn
> }
>
> $ clang -cc1 -analyze -analyzer-output=text -analyzer-checker=core
> example_6.cpp -analyzer-config prune-paths=false
>
> example_6.cpp:10:3: note: 'x' initialized to a null pointer value
>   int *x = 0;
>   ^~~~~~
> example_6.cpp:11:3: note: Calling 'assert'
>   assert(flag);
>   ^~~~~~~~~~~~
> example_6.cpp:5:7: note: Assuming 'b' is not equal to 0
>   if (!b)
>       ^~
> example_6.cpp:5:3: note: Taking false branch
>   if (!b)
>   ^
> example_6.cpp:11:3: note: Returning from 'assert'
>   assert(flag);
>   ^~~~~~~~~~~~
> example_6.cpp:12:7: note: 'flag' is not equal to 0
>   if (flag) // assumed true
>       ^~~~
> example_6.cpp:12:3: note: Taking true branch
>   if (flag) // assumed true
>   ^
> example_6.cpp:13:8: note: Dereference of null pointer (loaded from
> variable 'x')
>     *x = 5; // warn
>      ~ ^
>
> As you said, this example is interesting because assert isn't a macro
> here, and flag's collapse point really happens in a nested stack frame. Am
> I understanding correctly that maybe we shouldn't pursue tracking a
> variable if the collapse point happened in the same stack frame?
>
>>
>>
>>>
>>> *Example 7.:*
>>>
>>> 01 int getInt();
>>> 02 03 void f(int flag) {
>>> 04   int *x = 0;
>>> 05 flag = getInt();
>>> 06 assert(flag);
>>> 07   if (flag) // assumed true
>>> 08     *x = 5; // warn
>>> 09 }
>>>
>>> Like Example 6, we should explain why know that flag is non-zero on line
>>> 7 by tracking it back to line 6. Like in the case of Example 4., the user
>>> could see where flag was written, so we wouldn't like to see a note on line
>>> 5.
>>>
>>> So what's the takeaway?
>>>
>>> After teaching the analyzer the difference in between a condition and a
>>> "regularly tracked expression", I plan to implement the following two rules:
>>>
>>> Track a condition only if
>>> a.) The collapse point doesn't coincide with the condition point
>>> b.) It was written in a nested stack frame.
>>>
>>
>> Do you mane a) or b), or a) and b)? Also, what to do with "multiple
>> collapse point" events like:
>>
>> If either happens.
>
>>
>> if ( a < 10 ) ;
>> if ( a < 5) // Here we do add new constraints to a, but we also had other
>> constraints before. Do you consider this to coincide or not?
>>  ...
>>
>>
>> I'd rather track the condition *until* we cover those points (which
>> implies not tracking it at all when there are no b.)-points and the
>> a.)-point coincide with the current terminator), rather than track it
>> forever if we have those points.
>>
>> By collapse point i meant the point where the (bool)condition collapses
>> to a constant-false or a constant-true (in the true case the condition
>> itself doesn't necessarily collapse to a constant 1). It doesn't happen
>> every time we introduce a constraint.
>>
>> Yup, that's what I was planning to go for.
>
>>
>>
>>
>>>
>>> We hope that by implementing this, tracking conditions to conditions
>>> would be kept at bay without a counter to limit the depth of recursion, and
>>> the intolerable growth of bug length with drastically shorten. I do expect
>>> skeletons to fall out of the closet, but I am confident that this is a good
>>> initial approach.
>>>
>>> As explained earlier, this is mission number one, so I'll prioritize
>>> getting it right before pursuing the "should-not-have-happened" case.
>>>
>>> One thing I did not touch on just yet, is the case where an assert was
>>> (correctly, by the way) regarded as a control dependency, and it's
>>> condition was tracked. This is undoubtedly undesirable, but figuring out
>>> whether the condition is found in an assert is rather difficult. Asserts
>>> are often implemented as a macro, and could have a very, for a lack of a
>>> better word, esoteric implementations on certain platforms. We discussed
>>> trying to tinker with the control dependency calculator, namely skipping
>>> over nodes that have two successors and one of them leads to noreturn node,
>>> but I still need time to figure out something reliable.
>>>
>>
>> Some asserts consists of more basic blocks. Having two successors where
>> one of them is a noreturn block is not a sufficient condition. Consider
>> for example assert(a && b); Here the basic block for evaluating a will
>> either go to the next line after assertion or to the evaluation of b which
>> is not a noreturn block. While it is true that from node a we will
>> either go to the next non-assert block or end up in a noreturn block, the
>>  noreturn block might not be an immediate successor of node a.
>>
>>
>> Forgot to mention, i have a history with the CFG for asserts that's
>> mostly contained in https://reviews.llvm.org/D28023 and
>> https://reviews.llvm.org/D35673. This is the reason why i believe that
>> some implementation of assert() have a lot of CFG blocks on their own,
>> regardless of how many CFG blocks does it take to evaluate the assert
>> condition.
>>
>> Thanks! :D Right now, incorrect tracking of assert conditions isn't on
> the top of my priority list, but I'll think about this when I'm sitting on
> the bus or watering the garden.
>
>>
>> Regards,
>> Gabor
>>
>>
>>>
>>> Thanks to everyone who helped me along: Artem Dergachev, Gábor Horváth,
>>> Ádám Balogh, Jakub Kuderski, and Zoltán Porkoláb!
>>>
>>> Cheers,
>>> Kristóf
>>>
>>> [1]  http://lists.llvm.org/pipermail/cfe-dev/2019-June/062535.html
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190625/0462d73c/attachment.html>