[clang] [analyzer] Improve handling of unsigned values in ArrayBoundCheckerV2 (PR #81034)

via cfe-commits cfe-commits at lists.llvm.org
Mon Feb 19 15:24:03 PST 2024


NagyDonat wrote:

The main effect of this commit is that it eliminates ~300-400 ArrayBoundV2 reports. I didn't review each of them, but I checked dozens of them and those were all "Out of bound access to memory after the end of the region" false positives that tried to access something at index `-1` through an opaque pointer.

This is a logical consequence of the commit:
-  for pointers that are not understood by the analyzer `getDynamicExtent()` conjures an extent symbol which is handled as a `size_t` value; 
- before this commit this extent symbol was blindly compared with the offset;
- when the offset is `-1` it was converted to SIZE_MAX by `evalBinOpNN` which evaluates the comparison according to the C++ rules;
- even if the extent symbol was totally unconstrained, `evalBinOpNN` was able to determine that it's not larger than SIZE_MAX;
- so it reported an _overflow_ error when `-1` was used as an array index.
- Now the new shortcut realizes that `-1` is negative, so it's certainly smaller than an unsigned value (the extent symbol), and these situations are not reported as overflow errors.
(Note that `-1` is very significant here: for other negative numbers we only get an assumption that the extent symbol must be huge to accommodate that index.)

I'm surprised that these false positives were so prevalent previously; but then it's good news that this change eliminates them.

There are also lots of lost ArrayBound (V1) reports (e.g. 120 of them on ffmpeg, there are a few others on other projects as well), but I didn't analyze those too closely, because I hope that I can remove that checker in the foreseeable future. The few ones that I checked were difficult to understand / potentially false positives.

There were also two CastSize reports that no longer appear after this commit [(1)](https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=ffmpeg_n4.3.1_without_arrayboundv2_generalize_negative_vs_unsigned&newcheck=ffmpeg_n4.3.1_with_arrayboundv2_generalize_negative_vs_unsigned&diff-type=Resolved&is-unique=on&report-id=3473303&report-hash=8e6de29986e3b683337815887e8b1c03&report-filepath=%2ah263.c), [(2)](https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=ffmpeg_n4.3.1_without_arrayboundv2_generalize_negative_vs_unsigned&newcheck=ffmpeg_n4.3.1_with_arrayboundv2_generalize_negative_vs_unsigned&diff-type=Resolved&is-unique=on&report-id=3474439&report-hash=d7855d83835dab440c2395c0933dd152&report-filepath=%2avp9recon.c), but they were confusing, messy reports on complicated code, so I don't think that losing them is a significant effect.
 


https://github.com/llvm/llvm-project/pull/81034


More information about the cfe-commits mailing list