[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

Donát Nagy via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Aug 28 04:30:27 PDT 2023


donat.nagy accepted this revision.
donat.nagy added a comment.
This revision is now accepted and ready to land.

Thanks for the updates! I didn't spot any important issue, so I'm accepting this commit, but let's wait for the opinion of @steakhal before merging this.

In D158707#4613955 <https://reviews.llvm.org/D158707#4613955>, @danix800 wrote:

> In D158707#4613743 <https://reviews.llvm.org/D158707#4613743>, @donat.nagy wrote:
>
>> [...] Does this commit fix the "root" of the issue by eliminating some misuse of correctly implemented (but perhaps surprising) `SVal` / `APSInt` arithmetic; or is there an underlying bug in the `SVal` / `APSInt` arithmetic which is just avoided (and not eliminated) by this commit? In the latter case, what obstacles prevent us from fixing the root cause?
>
> For the observed signed compared to unsigned unexpectation from ArrayBoundV2,
> the constraint manager does give the CORRECT result.
>
> It's a misuse of the constraint API, mainly caused by unexpected unsigned extent
> set by memory modeling. Actually `ArrayBoundV2::getSimplifiedOffsets()` has clear
> comment that this `signed <-> unsigned` comparison is problematic.
>
>   // TODO: once the constraint manager is smart enough to handle non simplified
>   // symbolic expressions remove this function. Note that this can not be used in
>   // the constraint manager as is, since this does not handle overflows. It is
>   // safe to assume, however, that memory offsets will not overflow.
>   // NOTE: callers of this function need to be aware of the effects of overflows
>   // and signed<->unsigned conversions!
>   static std::pair<NonLoc, nonloc::ConcreteInt>
>   getSimplifiedOffsets(NonLoc offset, nonloc::ConcreteInt extent,
>                        SValBuilder &svalBuilder) {

In fact, the "NOTE:" comment was added by my patch D135375 <https://reviews.llvm.org/D135375> (which eliminated lots of false positives caused by a situation when the argument `offset` was unsigned), but I was still confused by this new bug. (Where the other argument `extent` was unsigned and this led to incorrect conclusions like "`len` cannot be larger than `3u`, so `len` cannot be `-1`, because `-1` is larger than `3u`" 🙃 .) Thanks for troubleshooting this issue!



================
Comment at: clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp:34
+    if (auto SSize =
+            SVB.convertToArrayIndex(*Size).getAs<DefinedOrUnknownSVal>())
+      return *SSize;
----------------
I think it's a good convention if `getDynamicExtent()` will always return concrete values as `ArrayIndexTy`. (If I didn't miss some very obscure case, then this will be true when this commit is merged.) 


================
Comment at: clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp:49
+    return UnknownVal();
+  MR = MR->StripCasts();
+
----------------
Note that `StripCasts()` is overzealous and strips array indexing when the index is zero. For example if you have a rectangular matrix `int m[6][8];` then this function would (correctly) say that the element count of `m[1]` is 8, but it would claim that the element count of `m[0]` is 6 (because it strips the 'cast' and calculates the element count of `m`).

This is caused by the hacky "solution" that casts are represented by `ElementRegion{original region, 0, new type}` which cannot be distinguished from a real element access where the index happens to be 0. (This is just a public service announcement, this already affects e.g. `getDynamicExtent` and I don't think that you can find a local solution. For more information, see also https://github.com/llvm/llvm-project/issues/42709 which plans to refactor the memory model.)


================
Comment at: clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp:72-75
+  auto ElementCount =
+      SVB.evalBinOp(State, BO_Div, Size, ElementSize, SVB.getArrayIndexType())
+          .getAs<DefinedOrUnknownSVal>();
+  return ElementCount ? *ElementCount : UnknownVal();
----------------
Perhaps use the method `value_or` of std::optional?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158707/new/

https://reviews.llvm.org/D158707



More information about the cfe-commits mailing list