[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent
DonĂ¡t Nagy via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Aug 28 07:21:22 PDT 2023
donat.nagy added a comment.
In D158707#4621135 <https://reviews.llvm.org/D158707#4621135>, @steakhal wrote:
> Oh, so we would canonicalize towards a signed extent type. I see. I think I'm okay with that.
> However, I think this needs a little bit of polishing and testing when the region does not point to the beginning of an array or object, but rather inside an object, or like this:
>
> int nums[] = {1,2,3};
> char *p = (char*)&nums[1];
>
> clang_analyzer_dumpExtent(p);
> clang_analyzer_dumpElementCount(p);
> ++p;
> clang_analyzer_dumpExtent(p);
> clang_analyzer_dumpElementCount(p);
> ++p;
> int *q = (int*)p;
> clang_analyzer_dumpExtent(q);
> clang_analyzer_dumpElementCount(q);
Unfortunately the analyzer engine does not distinguish pointer arithmetic and element access, and we cannot handle these situations while that limitation exists. For example, if we have an array `int nums[3];`, then the element `nums[1]` and the incremented pointer `int *ptr = nums + 1;` are represented by the same `ElementRegion` object; so we cannot simultaneously say that (1) `nums[1]` is an int-sized memory region, and (2) it's valid to access the elements of the array as `ptr[-1]`, `ptr[0]` and `ptr[1]`.
The least bad heuristic is probably `RegionRawOffsetV2::computeOffset()` from ArrayBoundCheckerV2, which strips away all the `ElementRegion` layers, acting as if they are all coming from pointer arithmetic. This behavior is incorrect if we want to query the extent/elementcount of a "real" ElementRegion (e.g. a slice of a multidimensional array or `(char*)&nums[1]` in your example), but mostly leads to false negatives. On the other hand, if we process a pointer increment as if it were an element access, we can easily run into false positive reports -- this is why I had to abandon D150446 <https://reviews.llvm.org/D150446>.
I'd suggest that we should ignore pointer arithmetic in this commit (or create a testcase that documents the current insufficient behavior) and increment the priority of a through rewrite of the memory model. This is related to the plan https://github.com/llvm/llvm-project/issues/42709 suggested by @NoQ, but perhaps it would be enough to do a less drastic change in that direction.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158707/new/
https://reviews.llvm.org/D158707
More information about the cfe-commits
mailing list