[clang] [clang] add array out-of-bounds access constraints using llvm.assume (PR #159046)

Michael Kruse via cfe-commits cfe-commits at lists.llvm.org
Wed Oct 1 16:02:28 PDT 2025


https://github.com/Meinersbur commented:

There have been past discussions about whether it is legal to exploit this in C/C++. See e.g.
 * https://discourse.llvm.org/t/delinearization-validity-checks-in-dependenceanalysis/52000/10
 * https://discourse.llvm.org/t/rfc-adding-range-metadata-to-array-subscripts/57912
 * https://reviews.llvm.org/D114988

In short, not everybody agreed that this is allowed in every version of C or C++. One thing I don't see accounted for in this PR is that it is legal to form a pointer to the one-past-the-last element of an array. So for `float A[10]`, `A+10` is valid, but must not be dereferenced. The assume is only emitted for the subscript operator, which technically is syntactic sugar that includes the indirection. However, in practice programmers will use `&A[10]` to create a pointer to one past the end, e.g.:
```c
float A[10];
int n = 10;
...
for (float *p = &A[0]; p < &A[n]; ++p) { ... }
if (n != 10) abort();
```
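We should be very careful not to miscompile this because we added an `assume(n < 10)`. With that assumption in place, the optimizer could conclude that `n != 10` is always true and make the `abort()` unconditional.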
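To make the concern concrete, here is a minimal sketch in C; the function name `count_iterations` is hypothetical and not taken from the PR. As far as I know, C99 and later specify that `&A[n]` is evaluated as if it were `A + n`, without performing the indirection, so the loop bound is a valid one-past-the-end pointer when `n == 10`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: walks a 10-element array using &A[n] as the
   loop bound and returns the number of iterations performed. */
size_t count_iterations(size_t n) {
    float A[10] = {0};
    size_t iters = 0;

    assert(n <= 10);  /* n == 10 is allowed: &A[10] is one past the end */

    /* &A[n] is only compared against, never dereferenced. */
    for (float *p = &A[0]; p < &A[n]; ++p)
        ++iters;

    return iters;
}
```

Calling `count_iterations(10)` should perform ten iterations. If the frontend attached `assume(n < 10)` to the subscript in `&A[n]`, the optimizer would be entitled to treat that call as unreachable.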

There is also code around that assumes a flattened layout of multi-dimensional arrays. For instance:
```c
float A[10][10];
(&A[0][0])[99]; // or more bluntly: `A[0][99]`
```
This is questionable since technically, `&A[0][0]` points into the first subobject of the outer array, which is an array of only 10 elements, so index 99 is out of bounds for it.
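A minimal sketch of such code in C follows. The names `read_flat`, `flat_matches_nested`, `ROWS`, and `COLS` are hypothetical. Whether the flat access is strictly conforming is exactly what the linked discussions dispute. The sketch only shows the pattern that exists in practice and that a bound assumption on the inner dimension would break.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 10, COLS = 10 };

/* Reads element (i, j) through a flat pointer to the first float.
   The flat index can reach 99, beyond the inner bound of 10. */
float read_flat(float A[ROWS][COLS], size_t i, size_t j) {
    assert(i < ROWS && j < COLS);
    float *flat = &A[0][0];
    return flat[i * COLS + j];
}

/* Returns 1 if every flat read agrees with the nested subscript. */
int flat_matches_nested(void) {
    float A[ROWS][COLS];

    for (size_t i = 0; i < ROWS; ++i)
        for (size_t j = 0; j < COLS; ++j)
            A[i][j] = (float)(i * COLS + j);

    for (size_t i = 0; i < ROWS; ++i)
        for (size_t j = 0; j < COLS; ++j)
            if (read_flat(A, i, j) != A[i][j])
                return 0;

    return 1;
}
```

On compilers that do not exploit the inner array bound, `flat_matches_nested()` is expected to return 1. Emitting `assume(index < 10)` for the flat subscript would invalidate that expectation.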

I would be careful about exploiting this kind of information, and possibly protect it with a compiler switch in the tradition of `-fstrict-aliasing`.

https://github.com/llvm/llvm-project/pull/159046

