[clang] [analyzer] Avoid unnecessary super region invalidation in `CStringChecker` (PR #146212)
DonĂ¡t Nagy via cfe-commits
cfe-commits at lists.llvm.org
Wed Jul 2 07:47:20 PDT 2025
================
@@ -2223,16 +2223,81 @@ void CStringChecker::evalStrcpyCommon(CheckerContext &C, const CallEvent &Call,
Result = lastElement;
}
+ // For bounded method, amountCopied take the minimum of two values,
+ // for ConcatFnKind::strlcat:
+ // amountCopied = min (size - dstLen - 1 , srcLen)
+ // for others:
+ // amountCopied = min (srcLen, size)
+ // So even if we don't know about amountCopied, as long as one of them will
+ // not cause an out-of-bound access, the whole function's operation will not
+ // too, that will avoid invalidating the superRegion of data member in that
+ // situation.
+ bool CouldAccessOutOfBound = true;
+ if (IsBounded && amountCopied.isUnknown()) {
+ // Get the max number of characters to copy.
+ SizeArgExpr lenExpr = {{Call.getArgExpr(2), 2}};
+ SVal lenVal = state->getSVal(lenExpr.Expression, LCtx);
+
+ // Protect against misdeclared strncpy().
+ lenVal =
+ svalBuilder.evalCast(lenVal, sizeTy, lenExpr.Expression->getType());
+
+ std::optional<NonLoc> lenValNL = lenVal.getAs<NonLoc>();
+
+ auto CouldAccessOutOfBoundForSVal = [&](NonLoc Val) -> bool {
+ return !isFirstBufInBound(C, state, C.getSVal(Dst.Expression),
+ Dst.Expression->getType(), Val,
+ C.getASTContext().getSizeType());
+ };
+
+ if (strLengthNL) {
+ CouldAccessOutOfBound = CouldAccessOutOfBoundForSVal(*strLengthNL);
+ }
+
+ if (CouldAccessOutOfBound && lenValNL) {
+ switch (appendK) {
+ case ConcatFnKind::none:
+ case ConcatFnKind::strcat: {
+ CouldAccessOutOfBound = CouldAccessOutOfBoundForSVal(*lenValNL);
+ break;
+ }
+ case ConcatFnKind::strlcat: {
----------------
NagyDonat wrote:
I strongly suspect that calculating `(size - dstLen - 1)` will help only if both `size` and `dstLen` are known concrete integers, because the analyzer cannot properly work with symbolical expressions with this sort of complexity. (E.g. if it knows that `a < b`, it cannot conclude that `a - 1 < b` -- partially because of "not yet implemented" limitations and partially because technically this conclusion isn't true when `a == INT_MIN`.)
I would suggest simplifying the code by using `size` as a rough over-approximation instead of `(size - dstLen - 1)` in this case because that _might_ be simple enough to be useful. (I can imagine that perhaps sometimes the analyzer could conclude that copying at most `size` bytes doesn't cause an overflow, this is more unlikely with the complicated exact upper bound.)
If you want, you can introduce a special case that uses `(size - dstLen - 1)` if the values are known concrete integers -- but I feel that this may be "too special" to deserve custom logic.
https://github.com/llvm/llvm-project/pull/146212
More information about the cfe-commits
mailing list