[PATCH] D148355: [analyzer] Fix comparison logic in ArrayBoundCheckerV2
DonĂ¡t Nagy via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Apr 24 05:40:57 PDT 2023
donat.nagy marked an inline comment as done.
donat.nagy added inline comments.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:173
+ const MemSpaceRegion *SR = rawOffset.getRegion()->getMemorySpace();
+ if (SR->getKind() != MemRegion::UnknownSpaceRegionKind) {
+ // a pointer to UnknownSpaceRegionKind may point to the middle of
----------------
steakhal wrote:
> donat.nagy wrote:
> > steakhal wrote:
> > >
> > You're completely right, I just blindly copied this test from the needlessly overcomplicated `computeExtentBegin()`.
> Hold on. This would only skip the lower bounds check if it's an `UnknownSpaceRegion`.
> Shouldn't we early return instead?
This behavior is inherited from the code before my commit: the old block `if ( /*... =*/ extentBegin.getAs<NonLoc>() ) { /* ... */ }` is equivalent to `if (llvm::isa<UnknownSpaceRegion>(SR)) { /*...*/ }` and there was no early return connected to //this// NonLocness check. (The old code skipped the upper bound check if the result of `evalBinOpNN()` is unknown, and that's what I changed because I saw no reason to do an early return there.)
After some research into the memory region model, I think that there is no reason to perform an early return -- in fact, the condition of this `if` seems to be too narrow because we would like to warn about code like
struct foo {
int tag;
int array[5];
};
int f(struct foo *p) {
return p->arr[-1];
}
despite the fact that it's indexing into a `FieldRegion` inside a `SymbolicRegion` in `UnknownSpaceRegion`. That is, instead of checking the top-level MemorySpace, the correct logic would be checking the kind of the memory region and/or perhaps its immediate super-region.
As this is a complex topic and completely unrelated to the main goal of this commit; I'd prefer to keep the old (not ideal, but working) logic in this patch, then revisit this question by creating a separate follow-up commit.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:173
+ const MemSpaceRegion *SR = rawOffset.getRegion()->getMemorySpace();
+ if (SR->getKind() != MemRegion::UnknownSpaceRegionKind) {
+ // a pointer to UnknownSpaceRegionKind may point to the middle of
----------------
donat.nagy wrote:
> steakhal wrote:
> > donat.nagy wrote:
> > > steakhal wrote:
> > > >
> > > You're completely right, I just blindly copied this test from the needlessly overcomplicated `computeExtentBegin()`.
> > Hold on. This would only skip the lower bounds check if it's an `UnknownSpaceRegion`.
> > Shouldn't we early return instead?
> This behavior is inherited from the code before my commit: the old block `if ( /*... =*/ extentBegin.getAs<NonLoc>() ) { /* ... */ }` is equivalent to `if (llvm::isa<UnknownSpaceRegion>(SR)) { /*...*/ }` and there was no early return connected to //this// NonLocness check. (The old code skipped the upper bound check if the result of `evalBinOpNN()` is unknown, and that's what I changed because I saw no reason to do an early return there.)
>
> After some research into the memory region model, I think that there is no reason to perform an early return -- in fact, the condition of this `if` seems to be too narrow because we would like to warn about code like
> struct foo {
> int tag;
> int array[5];
> };
> int f(struct foo *p) {
> return p->arr[-1];
> }
> despite the fact that it's indexing into a `FieldRegion` inside a `SymbolicRegion` in `UnknownSpaceRegion`. That is, instead of checking the top-level MemorySpace, the correct logic would be checking the kind of the memory region and/or perhaps its immediate super-region.
>
> As this is a complex topic and completely unrelated to the main goal of this commit; I'd prefer to keep the old (not ideal, but working) logic in this patch, then revisit this question by creating a separate follow-up commit.
Minor nitpick: your suggested change accidentally negated the conditional :) ... and I said that it's "completely right". I'm glad that I noticed this and inserted the "!" before the `isa` check because otherwise it could've been annoying to debug this...
================
Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:174-175
+ if (SR->getKind() != MemRegion::UnknownSpaceRegionKind) {
+ // a pointer to UnknownSpaceRegionKind may point to the middle of
+ // an allocated region
----------------
steakhal wrote:
> Good point.
For now I'm keeping this comment (with your formatting changes), because it's "approximately correct", but I'll replace or elaborate it when I refine the condition for skipping the lower bound check.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:196-198
+ ProgramStateRef state_withinUpperBound, state_exceedsUpperBound;
+ std::tie(state_withinUpperBound, state_exceedsUpperBound) =
+ compareValueToThreshold(state, ByteOffset, *KnownSize, svalBuilder);
----------------
steakhal wrote:
> I think as we don't plan to overwrite/assign to these states, we could just use structured bindings.
> I think that should be preferred over `std::tie()`-ing think. That is only not widespread because we just recently moved to C++17.
Good suggestion, applied both here and at the lower bound check.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:212-215
}
- assert(state_withinUpperBound);
- state = state_withinUpperBound;
+ if (state_withinUpperBound)
+ state = state_withinUpperBound;
----------------
steakhal wrote:
> donat.nagy wrote:
> > steakhal wrote:
> > > You just left the guarded block in the previous line.
> > > By moving this statement there you could spare the `if`.
> > Nice catch :)
> On second though no. The outer if guards `state_exceedsUpperBound`.
> So this check seems necessary.
Yup, kept it. You had me at first glance... ;)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D148355/new/
https://reviews.llvm.org/D148355
More information about the cfe-commits
mailing list