[clang] [analyzer] Suppress out of bounds reports after weak loop assumptions (PR #109804)

Donát Nagy via cfe-commits cfe-commits at lists.llvm.org
Wed Oct 2 08:41:22 PDT 2024


================
@@ -194,3 +199,99 @@ char test_comparison_with_extent_symbol(struct incomplete *p) {
   return ((char *)p)[-1]; // no-warning
 }
 
+// WeakLoopAssumption suppression
+///////////////////////////////////////////////////////////////////////
+
+int GlobalArray[100];
+int loop_suppress_after_zero_iterations(unsigned len) {
+  for (unsigned i = 0; i < len; i++)
+    if (GlobalArray[i] > 0)
+      return GlobalArray[i];
+  // Previously this would have produced an overflow warning because splitting
+  // the state on the loop condition introduced an execution path where the
+  // analyzer thinks that len == 0.
+  // There are very many situations where the programmer knows that an argument
+  // is positive, but this is not indicated in the source code, so we must
+  // avoid reporting errors (especially out of bounds errors) on these
+  // branches, because otherwise we'd get prohibitively many false positives.
+  return GlobalArray[len - 1]; // no-warning
+}
+
+void loop_report_in_second_iteration(int len) {
+  int buf[1] = {0};
+  for (int i = 0; i < len; i++) {
+    // When a programmer writes a loop, we may assume that they intended at
+    // least two iterations.
+    buf[i] = 1; // expected-warning{{Out of bound access to memory}}
+  }
+}
+
+void loop_suppress_in_third_iteration(int len) {
+  int buf[2] = {0};
+  for (int i = 0; i < len; i++) {
+    // We should suppress array bounds errors on the third and later iterations
+    // of loops, because sometimes programmers write a loop in situations
+    // where they know that there will be at most two iterations.
+    buf[i] = 1; // no-warning
+  }
+}
+
+void loop_suppress_in_third_iteration_cast(int len) {
+  int buf[2] = {0};
+  for (int i = 0; (unsigned)(i < len); i++) {
----------------
NagyDonat wrote:

I thought about this question and realized that my `IgnoreParenCasts()` call cannot cause trouble, because its result is used in only one, very limited way: it is compared to `EagerlyAssumeExpr`. (The `assumeCondition` call happens above this.)

The only effect of ignoring the cast is that I can recognize that the loop condition expression `(unsigned)(i < len)` is _essentially_ the same as `i < len`, i.e. the conditional expression where the eager assumption happens.

Now that I think about it, highlighting this "cast around the conditional expression" case feels a bit arbitrary, because there _will_ obviously be some differences between the `eagerly-assume=true` and `eagerly-assume=false` analysis modes. For example, IIUC, in the similarly awkward `for (int i=0; !(i >= len); i++)` the suppression would activate under `eagerly-assume=false` (because then the loop condition is ambiguous), but it wouldn't activate under `eagerly-assume=true` (because the eager assumption is tied to `i >= len` and not to the full loop condition, which is `!(i >= len)`).
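
To make the comparison concrete, here is a minimal sketch of the three loop-condition spellings discussed above. The function names are hypothetical and not part of the PR. At run time all three loops iterate the same number of times; they differ only in which subexpression is the comparison, which is what matters for where the eager assumption is attached.

```c
#include <assert.h>

/* Hypothetical helpers (not from the PR): each returns the number of
   iterations its loop performs for a given len. */

/* Plain form: the loop condition itself is the comparison `i < len`. */
int count_plain(int len) {
  int n = 0;
  for (int i = 0; i < len; i++)
    n++;
  return n;
}

/* Cast form: the loop condition is a cast wrapped around `i < len`. */
int count_cast(int len) {
  int n = 0;
  for (int i = 0; (unsigned)(i < len); i++)
    n++;
  return n;
}

/* Negated form: the loop condition is `!(i >= len)`, so the comparison
   `i >= len` is only a subexpression of the condition. */
int count_negated(int len) {
  int n = 0;
  for (int i = 0; !(i >= len); i++)
    n++;
  return n;
}

/* All three spellings must agree on the iteration count. */
void check_same_iteration_count(int len) {
  assert(count_plain(len) == count_cast(len));
  assert(count_plain(len) == count_negated(len));
}
```

Since the cast form and the negated form are equally artificial variations of `i < len`, special-casing only one of them would be hard to justify.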

Based on this, I think the best solution would be to just delete both this `IgnoreParenCasts()` call and the testcase with `(unsigned)(i < len)`.

https://github.com/llvm/llvm-project/pull/109804

