[PATCH] D104285: [analyzer][AST] Retrieve value by direct index from list initialization of constant array declaration.

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Aug 16 12:31:25 PDT 2021


aaron.ballman added a comment.

In D104285#2947255 <https://reviews.llvm.org/D104285#2947255>, @ASDenysPetrov wrote:

> In D104285#2943449 <https://reviews.llvm.org/D104285#2943449>, @aaron.ballman wrote:
>
>> One thing I think is worth asking in this thread is whether what you're analyzing is undefined behavior?
>
> Technically you are right. Every exit out of an array extent is UB according to the Standard.

At least in C++; I'd have to double-check for C.

> But in practice we can rely on the fact that multidimensional arrays have a continuous layout in memory on stack.

"But in practice we can rely on <this UB behavior>" is a very dangerous assumption for users to make, and I think we shouldn't codify that in the design of the static analyzer. We might be able to make that guarantee for *clang*, but we can't make that guarantee for all implementations. One of the big uses of a static analyzer is with pointing out UB due to portability concerns.

> Also every compiler treats `int[2][2]` and `int**` differently. E.g.:
>
>   int arr[6][7];
>   arr[2][3]; // *(arr + (2*7 + 3)) = *(arr + 17)
>   
>   int *ptr = arr;
>   ptr[17]; //  *(arr + 17)
>   
>   int **ptr;
>   ptr[2][3] // *(*(ptr + 2) + 3)
>
> Many engineers expoit this fact and treat multidimensional arrays on stack through a raw pointer (`(int*)arr`). We can foresee their intentions and treat a multidimensional array as a single one instead of a warning about UB.

We can do that only if we're convinced that's a sound static analysis (and I'm not convinced). Optimizers can optimize based on the inference that code must be UB free, so I worry that there are optimization situations where the analyzer will fail to warn the user because we're assuming this is safe based purely on memory layout. However, things like TBAA, vectorization, etc may have a different analysis than strictly the memory layout.

>> And when you turn some of these examples into constant expressions, we reject them based on the bounds. e.g., https://godbolt.org/z/nYPcY14a8
>
> Yes, when we use expicit constants there we can catch such a warning, because AST parser can timely recognize the issue. The parser is not smart enough to treat variables. Static Analyzer is in charge of this and executes after the parser.

I'm aware.

> I think AST parser shall also ignore the Standard in this particular case with an eye on a real use cases and developers' intentions.

It would be a bug in Clang to do so; the standard requires a diagnostic if a constant evaluation cannot be performed due to UB: http://eel.is/c++draft/expr.const#5.7

> As you can see there is a bit modified version which doesn't emit the warning https://godbolt.org/z/Mdhhe6Eo9.

Correct; I would not expect Clang to diagnose that because it doesn't require constant evaluation. I was pointing out the `constexpr` diagnostics because it demonstrates that this code has undefined behavior and you're modelling it as though the behavior were concretely defined.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104285/new/

https://reviews.llvm.org/D104285



More information about the cfe-commits mailing list