[PATCH] D150446: [analyzer] Check ArraySubscriptExprs in ArrayBoundCheckerV2

Donát Nagy via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri May 26 11:23:52 PDT 2023


donat.nagy added a comment.

@NoQ **Re: ElementRegion hacks:** Your suggestion is very convincing, and I agree that while my idea would bring some clarity to some particular issues, overall it would just worsen the "chaotic heap of classes" situation. The only advantage of my solution is that it would be an incremental change; but it's only incrementing the amount of problems, so I think I won't continue working on it.

On the other hand, I'd be interested in participating in the implementation of your suggestion. Unfortunately I don't have enough capacity to just sit down and start coding it; but I could suggest this reorganization as a potential thesis topic for a student/intern. (Our team regularly gets interns from Eötvös Loránd University; if one of them is interested, then perhaps we could tackle this issue with me in a mentor / advisor role, and the student creating the bulk of the code reorganization commits. These are very far-fetched plans, but I'd be glad to see the cleanup that you suggested and I'll try to contribute what I can. Of course if you have a more concrete plan for doing this rewrite, then instead of intruding I'm happy to help with e.g. reviews.)

**Re: this review:**

> One straightforward implication is that this way [checking PreStmt<ArraySubscriptExpr>] we avoid dealing with multiplication/division when calculating offsets.

Unfortunately this is not the case, because the size of the arrays (calculated by `clang::ento::getDynamicExtent()` in DynamicExtent.cpp <https://clang.llvm.org/doxygen//DynamicExtent_8cpp_source.html>) is still expressed in bytes ("CharUnits") and we need to multiply the index with `sizeof(elemType)` before we can compare it to this value. The distinguishing feature of ArrayBoundCheckerV2 is that it has some (ad-hoc but useful) logic for reasoning about some multiplications; that part is not changed by this commit.

Now that you mention it, I see that it would be //nice// to avoid bothering with byte offsets and multiplications in the simple case when we want to take a single `int` from an array of `int`s. To achieve this we'd need to introduce extent handling functions that measure the size of arrays in number of elements (of what kind?) instead of bytes; right now I don't know how difficult would it be to achieve that. (As far as I see the extent is usually either a constant [that can be readily divided], or it's a freshly conjured symbol... Are those symbols used for anything meaningful? Can we ever put an upper bound on them?) However this all should probably belong to a separate commit/review/discourse thread.

**Re: PreStmt<ForStmt>:** That's a very good idea, but I feel that it belongs to the "engine level" and it's mostly orthogonal to the behavior this checker. My short-term (or at most medium-term) goal is to move this checker out of alpha (fix bugs, squash common false positives, improve messages etc.) and here I accept the limitation that (like all the currently "stable" checkers) it's mostly limited to loop-less code. On a longer term it'd be interesting to work on improving the loop handling engine (or just adding a gimmicky "loop prediction" ability to this checker); but I don't have concrete plans that far ahead.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150446/new/

https://reviews.llvm.org/D150446



More information about the cfe-commits mailing list