[clang] [analyzer] Suppress out of bounds reports after weak loop assumptions (PR #109804)

Thu Oct 10 06:26:25 PDT 2024

=?utf-8?q?Donát?= Nagy <donat.nagy at ericsson.com>,
=?utf-8?q?Donát?= Nagy <donat.nagy at ericsson.com>,
=?utf-8?q?Donát?= Nagy <donat.nagy at ericsson.com>,
=?utf-8?q?Donát?= Nagy <donat.nagy at ericsson.com>,
=?utf-8?q?Donát?= Nagy <donat.nagy at ericsson.com>,
=?utf-8?q?Donát?= Nagy <donat.nagy at ericsson.com>
Message-ID:
In-Reply-To: <llvm.org/llvm/llvm-project/pull/109804 at github.com>


================
@@ -121,6 +121,34 @@ struct EvalCallOptions {
   EvalCallOptions() {}
 };
 
+/// Simple control flow statements like `if` can only produce a single two-way
+/// state split, so when the analyzer cannot determine the value of the
+/// condition, it can assume either of the two options, because the fact that
+/// they are in the source code implies that the programmer thought that they
+/// are possible (at least under some conditions).
+/// (Note that this heuristic is not entirely correct when there are _several_
+/// `if` statements with unmarked logical connections between them, but it's
+/// still good enough and the analyzer heavily relies on it.)
+/// In contrast with this, a single loop statement can produce multiple state
+/// splits, and we cannot always single out safe assumptions where we can say
+/// that "the programmer included this loop in the source code, so they clearly
+/// thought that this execution path is possible".
+/// However, the analyzer wants to explore the code in and after the loop, so
+/// it makes assumptions about the loop condition (to get a concrete execution
+/// path) even when they are not justified.
+/// This function is called by the engine to mark the `State` when it makes an
+/// assumption which is "weak". Checkers may use this heuristical mark to
+/// discard the result and reduce the amount of false positives.
+/// TODO: Instead of just marking these branches for checker-specific handling,
+/// we could discard them completely. I suspect that this could eliminate some
+/// false positives without suppressing too many true positives, but I didn't
+/// have time to measure its effects.
+ProgramStateRef recordWeakLoopAssumption(ProgramStateRef State);
+
+/// Returns true if `recordWeakLoopAssumption()` was called on the execution
+/// path which produced `State`.
+bool seenWeakLoopAssumption(ProgramStateRef State);
----------------
isuckatcs wrote:

While browsing through the codebase, I've seen that these methods for the other traits are defined inside `class ExprEngine` as `static` methods with setters and deleters being `private` and getters being `public`. 

IIUC, they are in the global scope so that they can be accessed from `ArrayBoundCheckerV2`. Tbh, I would make `recordWeakLoopAssumption()` a `private` `static` method of the `ExprEngine` class, as only this class is able to assume anything about the condition, so it should be the only one, that can set any state traits related to those assumptions, while `seenWeakLoopAssumption()` could be a `public` `static` member function of `ExprEngine` to follow the pattern.


https://github.com/llvm/llvm-project/pull/109804