[clang] [analyzer] Switch to PostStmt callbacks in ArrayBoundV2 (PR #72107)

via cfe-commits cfe-commits at lists.llvm.org
Fri Nov 24 02:23:32 PST 2023


DonatNagyE wrote:

> Note that &array[idx] is perfectly valid code when `idx == number of elements`. And it is relatively common to do that when one is using STL algorithms on arrays:
> 
> ```
> auto it = std::find(&array[0], &array[size], foo);
> ```
>
> Of course, one could use the `begin/end` free functions, but those are only available since C++11.

Oh, dammit, this language is _stupid_... I couldn't imagine a reason to use `&array[size]` instead of the cleaner `(array + size)`; but _of course_ STL containers need `&array[size]` with their overloaded `operator[]` (note: this is not an `ArraySubscriptExpr` but an overloaded operator call, so it's not handled by this checker) and then (if you say so) some developers will use this for plain arrays as well...

> Could you elaborate on alternative approaches you considered fixing the problem and why you chose this one? E.g., would trying to look at the parent regions for expressions like `foo[idx].bar` work? Or is the source of the problem that you'd also need the exact expression for the subscript instead of the `MemberExpr`?

I considered two approaches, this one and walking on the parent region layers (I don't think that there is a third distinct approach). I chose this because:
- This way the implementation is easier to understand and more "connected" to the concrete details of the analyzed code.
- A few months ago @steakhal realized that `check::Location` is not sufficient on its own and it should've been supplemented with a `check::Bind` callback to cover all relevant cases. The commit implementing this (https://reviews.llvm.org/D159106) was abandoned, but if we keep `check::Location`, we would need to reintroduce something like that as well.
- I _like_ that I have the exact expression for the subscript (or other dereferencing expression, like `*ptr` or `ptr->foo`) because this puts the warning message to the right source location (e.g. in a convoluted multiline expressions with several `[]` and `->` layers) and allows/could allow better customization of the message. 

> Alternatively, would it be possible to suppress warnings on the common pattern `&array[idx]` by checking the parent of the subscript expression in the AST (but still emitting a warning when the pointer is dereferenced)?

Yes, that was my plan for resolving this issue; and as you say that it's needed, I'll implement it. I'll squeeze the parent lookup into the `if` where we know that there's an overflow, so that (AFAIK expensive) operation won't run in the common case when the array access is in bounds. The "still emitting a warning when the pointer is dereferenced" will happen automatically as at that point we'll have the same overflowing pointer but without the `&` that negates the result. 

I think that I won't suppress the warning in situations like
```
int array[10];
int *f(int arg) {
  if (arg >= 10)
    return &array[arg];
  return array;
}
```
where the analyzer knows that `idx >= size` and doesn't know that `idx == size`. Are there any crazy situations where this is also legitimate?

https://github.com/llvm/llvm-project/pull/72107


More information about the cfe-commits mailing list