[PATCH] D159107: [analyzer] ArrayBoundCheckerV2 should disallow forming lvalues to out-of-bounds locations

Balázs Benics via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 31 06:24:34 PDT 2023


steakhal added a comment.

In D159107#4631069 <https://reviews.llvm.org/D159107#4631069>, @donat.nagy wrote:

> In D159107#4630764 <https://reviews.llvm.org/D159107#4630764>, @steakhal wrote:
>
>> In D159107#4630573 <https://reviews.llvm.org/D159107#4630573>, @donat.nagy wrote:
>>
>>> I don't think that the `&arr[N]` issue is too serious: we can just increment the array extent when the parent expression of the array subscript operator is the unary operator `&`. If the past-the-end pointer ends up dereferenced later, the current code is sufficient to report it as a bug (as the checker monitors all dereferences).
>>
>> My instinct suggests that there is more behind the curtain.
>
> As a rough solution we can simply say that this checker ignores `&arr[idx]` (because it's just a different way of writing `arr + idx`) and only checks expressions where "real" dereference happens. This way the checker would won't emit false positives on past-the-end pointers and still report every out of bound //memory access// (including off-by-one errors).
>
> This can be refined by adding a check that which validates that in an expression like `&arr[idx]` the index satisfies `0 <= idx && idx <= array_size`. This is conceptually independent (but can use the same implementation as the real memory access check) and would add some useful reports and constraints without breaking anything.

It makes sense, but it would need to special case on the parent or child AST node, which isn't really clean. But I think we think the same on this subject.
(Note that the `&` is a parent of the `arr[N]`, where the checker would trigger, thus we would need to check the parent node to ignore such cases; and that is not really supported in the AST, and using the ParentMap for it is expensive. Remember, that this would happen for any SubscriptExpr as many times they are visited.)

> In D159107#4630764 <https://reviews.llvm.org/D159107#4630764>, @steakhal wrote:
>
>> For example, on an expression `&arr[N]` (of type  `int arr[10]`), we would constrain `N`. We could say that we allow the one before and one after elements, thus introduce a constraint: `N: [-1, 11]`. This `-1` later could participate in casts, and end up interpreted as a really large unsigned number. I should also think about how would we detect off-by-one errors, which is a common category.
>
> Pointer arithmetic is very restricted in the standard, e.g. http://eel.is/c++draft/basic.compound says that
>
>> Every value of pointer type is one of the following:
>> (3.1) a pointer to an object or function (the pointer is said to point to the object or function), or
>> (3.2) a pointer past the end of an object ([expr.add]), or
>> (3.3) the null pointer value for that type, or
>> (3.4) an invalid pointer value.
>
> As this explicitly rules out before-first-element pointers and they are not too common (I don't recall any code that used them), I don't think that they deserve a special case (just report them if we see them).

Yes. They are really restricted, and AFAIK only past-the-end pointers are allowed, unlike before-the-first element. Such code might be written for reverse iterations.
But we know that hardware would work like that, and people might rely on it.

> In D159107#4630764 <https://reviews.llvm.org/D159107#4630764>, @steakhal wrote:
>
>> The thing is, that I only have this week, aka. 1.5 days before I leave for vacation. :D
>> In the future, (no ETA), we likely come back to the Juliet improvements.
>
> Of course, it's completely reasonable to focus on a limited set of changes before the vacation.
>
> In the meantime, I'll probably work on one or more commits that would (1) switch to using the concrete `check::PostStmt` callbacks instead of `check::Location` (and `Bind`) like this commit (2) improve the diagnostics, add more details to the messages. When will you return from the vacation? Should I wait for you with the review of these changes (if I can write and test them before your return)?

I don't think you should wait for me. I'll be off for a week though, but I might not have the bandwidth to do reviews even after that.
Post it once you are fine with it. Prove it "behaves well", and just wait for someone to review it. It might be noq, xazax or me. We will see.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159107/new/

https://reviews.llvm.org/D159107



More information about the cfe-commits mailing list