[PATCH] D159107: [analyzer] ArrayBoundCheckerV2 should disallow forming lvalues to out-of-bounds locations
DonĂ¡t Nagy via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Aug 31 06:00:08 PDT 2023
donat.nagy added a comment.
In D159107#4630764 <https://reviews.llvm.org/D159107#4630764>, @steakhal wrote:
> In D159107#4630573 <https://reviews.llvm.org/D159107#4630573>, @donat.nagy wrote:
>
>> I don't think that the `&arr[N]` issue is too serious: we can just increment the array extent when the parent expression of the array subscript operator is the unary operator `&`. If the past-the-end pointer ends up dereferenced later, the current code is sufficient to report it as a bug (as the checker monitors all dereferences).
>
> My instinct suggests that there is more behind the curtain.
As a rough solution we can simply say that this checker ignores `&arr[idx]` (because it's just a different way of writing `arr + idx`) and only checks expressions where "real" dereference happens. This way the checker would won't emit false positives on past-the-end pointers and still report every out of bound //memory access// (including off-by-one errors).
This can be refined by adding a check that which validates that in an expression like `&arr[idx]` the index satisfies `0 <= idx && idx <= array_size`. This is conceptually independent (but can use the same implementation as the real memory access check) and would add some useful reports and constraints without breaking anything.
In D159107#4630764 <https://reviews.llvm.org/D159107#4630764>, @steakhal wrote:
> For example, on an expression `&arr[N]` (of type `int arr[10]`), we would constrain `N`. We could say that we allow the one before and one after elements, thus introduce a constraint: `N: [-1, 11]`. This `-1` later could participate in casts, and end up interpreted as a really large unsigned number. I should also think about how would we detect off-by-one errors, which is a common category.
Pointer arithmetic is very restricted in the standard, e.g. http://eel.is/c++draft/basic.compound says that
> Every value of pointer type is one of the following:
> (3.1) a pointer to an object or function (the pointer is said to point to the object or function), or
> (3.2) a pointer past the end of an object ([expr.add]), or
> (3.3) the null pointer value for that type, or
> (3.4) an invalid pointer value.
As this explicitly rules out before-first-element pointers and they are not too common (I don't recall any code that used them), I don't think that they deserve a special case (just report them if we see them).
In D159107#4630764 <https://reviews.llvm.org/D159107#4630764>, @steakhal wrote:
> The thing is, that I only have this week, aka. 1.5 days before I leave for vacation. :D
> In the future, (no ETA), we likely come back to the Juliet improvements.
Of course, it's completely reasonable to focus on a limited set of changes before the vacation.
In the meantime, I'll probably work on one or more commits that would (1) switch to using the concrete `check::PostStmt` callbacks instead of `check::Location` (and `Bind`) like this commit (2) improve the diagnostics, add more details to the messages. When will you return from the vacation? Should I wait for you with the review of these changes (if I can write and test them before your return)?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D159107/new/
https://reviews.llvm.org/D159107
More information about the cfe-commits
mailing list