[PATCH] D148355: [analyzer] Fix comparison logic in ArrayBoundCheckerV2

Donát Nagy via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Apr 24 05:40:57 PDT 2023


donat.nagy marked an inline comment as done.
donat.nagy added inline comments.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:173
+  const MemSpaceRegion *SR = rawOffset.getRegion()->getMemorySpace();
+  if (SR->getKind() != MemRegion::UnknownSpaceRegionKind) {
+    // a pointer to UnknownSpaceRegionKind may point to the middle of
----------------
steakhal wrote:
> donat.nagy wrote:
> > steakhal wrote:
> > > 
> > You're completely right, I just blindly copied this test from the needlessly overcomplicated `computeExtentBegin()`.
> Hold on. This would only skip the lower bounds check if it's an `UnknownSpaceRegion`.
> Shouldn't we early return instead?
This behavior is inherited from the code before my commit: the old block `if ( /*... =*/ extentBegin.getAs<NonLoc>() ) { /* ... */ }` is equivalent to `if (llvm::isa<UnknownSpaceRegion>(SR)) { /*...*/ }` and there was no early return connected to //this// NonLocness check. (The old code skipped the upper bound check if the result of `evalBinOpNN()` is unknown, and that's what I changed because I saw no reason to do an early return there.)

After some research into the memory region model, I think that there is no reason to perform an early return -- in fact, the condition of this `if` seems to be too narrow because we would like to warn about code like
  struct foo {
    int tag;
    int array[5];
  };
  int f(struct foo *p) {
    return p->array[-1];
  }
despite the fact that it's indexing into a `FieldRegion` inside a `SymbolicRegion` in `UnknownSpaceRegion`. That is, instead of checking the top-level MemorySpace, the correct logic would be checking the kind of the memory region and/or perhaps its immediate super-region.

As this is a complex topic and completely unrelated to the main goal of this commit, I'd prefer to keep the old (not ideal, but working) logic in this patch, then revisit this question in a separate follow-up commit.
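
To make the layout argument concrete, here is a small sketch in plain C++ (independent of the analyzer) that measures the byte offset of `array[Index]` in two ways: from the start of the `array` field and from the start of the enclosing struct. The helper names `offsetWithinField` and `offsetWithinStruct` are made up for this illustration, and the second `static_assert` assumes the usual ABI layout with no padding between `tag` and `array`.

```cpp
#include <cstddef>

struct foo {
  int tag;
  int array[5];
};

// Byte offset of array[Index] measured from the start of the `array` field.
// A negative result means the access falls before the field begins.
constexpr std::ptrdiff_t offsetWithinField(std::ptrdiff_t Index) {
  return Index * static_cast<std::ptrdiff_t>(sizeof(int));
}

// The same access measured from the start of the enclosing struct.
constexpr std::ptrdiff_t offsetWithinStruct(std::ptrdiff_t Index) {
  return static_cast<std::ptrdiff_t>(offsetof(foo, array)) +
         offsetWithinField(Index);
}

// array[-1] lies before the field...
static_assert(offsetWithinField(-1) < 0, "index -1 underflows the field");
// ...but (assuming no padding after `tag`) it still lands inside the struct,
// exactly on the preceding member.
static_assert(offsetWithinStruct(-1) ==
                  static_cast<std::ptrdiff_t>(offsetof(foo, tag)),
              "index -1 aliases the preceding member");
```

The offset is negative relative to the field but non-negative relative to the struct, which is why the kind of the accessed region seems more relevant to the lower bound check than the top-level memory space.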


================
Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:173
+  const MemSpaceRegion *SR = rawOffset.getRegion()->getMemorySpace();
+  if (SR->getKind() != MemRegion::UnknownSpaceRegionKind) {
+    // a pointer to UnknownSpaceRegionKind may point to the middle of
----------------
donat.nagy wrote:
> steakhal wrote:
> > donat.nagy wrote:
> > > steakhal wrote:
> > > > 
> > > You're completely right, I just blindly copied this test from the needlessly overcomplicated `computeExtentBegin()`.
> > Hold on. This would only skip the lower bounds check if it's an `UnknownSpaceRegion`.
> > Shouldn't we early return instead?
> This behavior is inherited from the code before my commit: the old block `if ( /*... =*/ extentBegin.getAs<NonLoc>() ) { /* ... */ }` is equivalent to `if (llvm::isa<UnknownSpaceRegion>(SR)) { /*...*/ }` and there was no early return connected to //this// NonLocness check. (The old code skipped the upper bound check if the result of `evalBinOpNN()` is unknown, and that's what I changed because I saw no reason to do an early return there.)
> 
> After some research into the memory region model, I think that there is no reason to perform an early return -- in fact, the condition of this  `if` seems to be too narrow because we would like to warn about code like
>   struct foo {
>     int tag;
>     int array[5];
>   };
>   int f(struct foo *p) {
>     return p->array[-1];
>   }
> despite the fact that it's indexing into a `FieldRegion` inside a `SymbolicRegion` in `UnknownSpaceRegion`. That is, instead of checking the top-level MemorySpace, the correct logic would be checking the kind of the memory region and/or perhaps its immediate super-region.
> 
> As this is a complex topic and completely unrelated to the main goal of this commit, I'd prefer to keep the old (not ideal, but working) logic in this patch, then revisit this question in a separate follow-up commit.
Minor nitpick: your suggested change accidentally negated the conditional :) ... and I said that it's "completely right". I'm glad that I noticed this and inserted the "!" before the `isa` check because otherwise it could've been annoying to debug this...


================
Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:174-175
+  if (SR->getKind() != MemRegion::UnknownSpaceRegionKind) {
+    // a pointer to UnknownSpaceRegionKind may point to the middle of
+    // an allocated region
 
----------------
steakhal wrote:
> Good point.
For now I'm keeping this comment (with your formatting changes), because it's "approximately correct", but I'll replace or elaborate it when I refine the condition for skipping the lower bound check.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:196-198
+    ProgramStateRef state_withinUpperBound, state_exceedsUpperBound;
+    std::tie(state_withinUpperBound, state_exceedsUpperBound) =
+        compareValueToThreshold(state, ByteOffset, *KnownSize, svalBuilder);
----------------
steakhal wrote:
> I think as we don't plan to overwrite/assign to these states, we could just use structured bindings.
> I think that should be preferred over the `std::tie()`-ing thing. That is only not widespread because we just recently moved to C++17.
Good suggestion, applied both here and at the lower bound check.
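
For readers less familiar with the two styles, the following sketch contrasts them on a toy helper. `compareToThreshold`, `exceedsWithTie`, `exceedsWithBindings` and `checkStylesAgree` are hypothetical names invented for this example; the helper returns plain `bool`s and is not the checker's `compareValueToThreshold()`.

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Hypothetical stand-in for a helper that splits a comparison into two
// outcomes; it is not the analyzer's compareValueToThreshold().
std::pair<bool, bool> compareToThreshold(int Value, int Threshold) {
  bool Within = Value < Threshold;
  return {Within, !Within};
}

// Pre-C++17 style: declare the variables first, then assign via std::tie().
bool exceedsWithTie(int Value, int Threshold) {
  bool Within, Exceeds;
  std::tie(Within, Exceeds) = compareToThreshold(Value, Threshold);
  (void)Within; // Unused in this toy example.
  return Exceeds;
}

// C++17 structured bindings: declaration and unpacking in one statement.
bool exceedsWithBindings(int Value, int Threshold) {
  auto [Within, Exceeds] = compareToThreshold(Value, Threshold);
  (void)Within; // Unused in this toy example.
  return Exceeds;
}

// Both styles unpack the same pair, so they must agree.
void checkStylesAgree() {
  assert(exceedsWithTie(3, 5) == exceedsWithBindings(3, 5));
  assert(exceedsWithTie(7, 5) == exceedsWithBindings(7, 5));
}
```

The structured-binding form removes the separate declaration line, which fits here because the unpacked values are never reassigned afterwards.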


================
Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:212-215
     }
 
-    assert(state_withinUpperBound);
-    state = state_withinUpperBound;
+    if (state_withinUpperBound)
+      state = state_withinUpperBound;
----------------
steakhal wrote:
> donat.nagy wrote:
> > steakhal wrote:
> > > You just left the guarded block in the previous line.
> > > By moving this statement there you could spare the `if`.
> > Nice catch :)
> On second thought, no. The outer if guards `state_exceedsUpperBound`.
> So this check seems necessary.
Yup, kept it. You had me at first glance... ;)
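
A simplified sketch of the control flow shows why the separate check stays. `State`, `StateRef` and `continueWith` are hypothetical stand-ins (a `std::shared_ptr` plays the role of the nullable `ProgramStateRef`), and the case where both states are null is an assumption of this sketch, not something confirmed for the real checker.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for ProgramStateRef: a nullable handle to a state.
struct State {
  int Id;
};
using StateRef = std::shared_ptr<const State>;

// Mirrors the shape of the code under review: the outer `if` tests only the
// "exceeds" state, so the "within" state needs its own null check afterwards.
// Returns the state to continue with, or nullptr if analysis stops here.
StateRef continueWith(StateRef Current, StateRef Within, StateRef Exceeds) {
  assert(Current && "caller must pass a valid starting state");
  if (Exceeds) {
    if (!Within)
      return nullptr; // Definitely out of bounds: report and stop.
    // Both states are feasible: the access may or may not be in bounds.
  }
  // This line is also reached when Exceeds is null, so the assignment
  // cannot simply be moved into the guarded block above.
  if (Within)
    Current = Within;
  return Current;
}
```

Moving the assignment into the `if (Exceeds)` block would skip the update in the ordinary in-bounds case, where `Exceeds` is null and `Within` is not.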


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148355/new/

https://reviews.llvm.org/D148355


