[PATCH] D159107: [analyzer] ArrayBoundCheckerV2 should disallow forming lvalues to out-of-bounds locations

DonĂ¡t Nagy via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 31 06:00:08 PDT 2023


donat.nagy added a comment.

In D159107#4630764 <https://reviews.llvm.org/D159107#4630764>, @steakhal wrote:

> In D159107#4630573 <https://reviews.llvm.org/D159107#4630573>, @donat.nagy wrote:
>
>> I don't think that the `&arr[N]` issue is too serious: we can just increment the array extent when the parent expression of the array subscript operator is the unary operator `&`. If the past-the-end pointer ends up dereferenced later, the current code is sufficient to report it as a bug (as the checker monitors all dereferences).
>
> My instinct suggests that there is more behind the curtain.

As a rough solution we can simply say that this checker ignores `&arr[idx]` (because it's just a different way of writing `arr + idx`) and only checks expressions where "real" dereference happens. This way the checker would won't emit false positives on past-the-end pointers and still report every out of bound //memory access// (including off-by-one errors).

This can be refined by adding a check that which validates that in an expression like `&arr[idx]` the index satisfies `0 <= idx && idx <= array_size`. This is conceptually independent (but can use the same implementation as the real memory access check) and would add some useful reports and constraints without breaking anything.

In D159107#4630764 <https://reviews.llvm.org/D159107#4630764>, @steakhal wrote:

> For example, on an expression `&arr[N]` (of type  `int arr[10]`), we would constrain `N`. We could say that we allow the one before and one after elements, thus introduce a constraint: `N: [-1, 11]`. This `-1` later could participate in casts, and end up interpreted as a really large unsigned number. I should also think about how would we detect off-by-one errors, which is a common category.

Pointer arithmetic is very restricted in the standard, e.g. http://eel.is/c++draft/basic.compound says that

> Every value of pointer type is one of the following:
> (3.1) a pointer to an object or function (the pointer is said to point to the object or function), or
> (3.2) a pointer past the end of an object ([expr.add]), or
> (3.3) the null pointer value for that type, or
> (3.4) an invalid pointer value.

As this explicitly rules out before-first-element pointers and they are not too common (I don't recall any code that used them), I don't think that they deserve a special case (just report them if we see them).

In D159107#4630764 <https://reviews.llvm.org/D159107#4630764>, @steakhal wrote:

> The thing is, that I only have this week, aka. 1.5 days before I leave for vacation. :D
> In the future, (no ETA), we likely come back to the Juliet improvements.

Of course, it's completely reasonable to focus on a limited set of changes before the vacation.

In the meantime, I'll probably work on one or more commits that would (1) switch to using the concrete `check::PostStmt` callbacks instead of `check::Location` (and `Bind`) like this commit (2) improve the diagnostics, add more details to the messages. When will you return from the vacation? Should I wait for you with the review of these changes (if I can write and test them before your return)?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159107/new/

https://reviews.llvm.org/D159107



More information about the cfe-commits mailing list