[PATCH] D152435: [analyzer][CStringChecker] Adjust the invalidation operation on the super region of the destination buffer during string copy

Balázs Benics via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jun 13 04:09:45 PDT 2023


steakhal added a comment.

Thanks for the elaborated answer on the GH issue. Now it makes sense. And the fix is align with the observations, which is good.
I only have concerns about the code quality of the modifier code. It was already in pretty bad shape, and I cannot say we improve it with this patch.
However, I can also see that it might require a bit engineering to refactor it into something cleaner, so I won't object if you push back.

Thanks for the patch!



================
Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:263-280
+  enum class InvalidationKind {
+    // Invalidate the source buffer for escaping pointers.
+    IK_SrcInvalidation,
+
+    // Invalidate the destination buffer determined by characters copied.
+    IK_DstInvalidationBySize,
+
----------------
I don't think this is the cleanest way to achieve this.
To me, it feels like it might make sense to have distinct invalidation functions for each scenario and mentally share the same overload-set.
Right now, the API looks really complicated: using enum values along with default parameters. Not to speak of how many parameters it accepts.
However, I won't object this time given that it was already pretty bad, and unreadable - so in that sense, it's not much worse this way and it was not the point to improve the quality of the code.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:992
 
-  QualType sizeTy = Size->getType();
+  QualType sizeTy = Size.getType(C.getASTContext());
   QualType PtrTy = Ctx.getPointerType(Ctx.CharTy);
----------------
The type of the `SVal` might not be always accurate.
in fact, we try to move away from using that in the future - whenever possible.
Consequently, I'd prefer passing down the type as an additional parameter instead, or just keeping the `Expr` along with the corresponding `SVal`.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:1071-1088
+    if (InvalidationKind::IK_SrcInvalidation == Kind) {
       ITraits.setTrait(R->getBaseRegion(),
                        RegionAndSymbolInvalidationTraits::TK_PreserveContents);
       ITraits.setTrait(R, RegionAndSymbolInvalidationTraits::TK_SuppressEscape);
       CausesPointerEscape = true;
-    } else {
-      const MemRegion::Kind& K = R->getKind();
-      if (K == MemRegion::FieldRegionKind)
-        if (Size && IsFirstBufInBound(C, state, E, Size)) {
-          // If destination buffer is a field region and access is in bound,
-          // do not invalidate its super region.
-          ITraits.setTrait(
-              R,
-              RegionAndSymbolInvalidationTraits::TK_DoNotInvalidateSuperRegion);
-        }
+    } else if (MemRegion::FieldRegionKind == R->getKind()) {
+      if (InvalidationKind::IK_DstAlwaysEscapeSuperRegion == Kind) {
----------------
Again, it's not your fault, but this code just looks insane.
I'm not surprised it had a bug, and likely it still has.


================
Comment at: clang/test/Analysis/issue-55019.c:3
+
+// RUN: %clang_analyze_cc1 %s -verify -Wno-strict-prototypes \
+// RUN:   -analyzer-checker=core \
----------------
Why do you need the `Wno-strict-prototypes`? Can we make it work with it?


================
Comment at: clang/test/Analysis/issue-55019.c:8
+
+#include "Inputs/system-header-simulator.h"
+void *malloc(size_t);
----------------
~~I guess this include is only for the `size_t`, right?
In c++, you could use `using size_t = decltype(sizeof(int));` to define it.~~

Ah, I see that it's for c function declarations. If that's the case, have you considered adding the `malloc` and `free` declarations to that header?


================
Comment at: clang/test/Analysis/issue-55019.c:26
+  clang_analyzer_dump(x.ptr); // expected-warning {{HeapSymRegion}}
+  memset(x.arr, 0, SIZE);
+  clang_analyzer_dump(x.ptr); // expected-warning {{HeapSymRegion}}
----------------
I believe you can use the `sizeof(x.arr)` instead.


================
Comment at: clang/test/Analysis/issue-55019.c:28
+  clang_analyzer_dump(x.ptr); // expected-warning {{HeapSymRegion}}
+  free(x.ptr);                // no-warning
+}
----------------
I would rather use `no-leak-warning` here to be more specific about what we don't expect.


================
Comment at: clang/test/Analysis/issue-55019.cpp:1-4
+// Refer issue 55019 for more details.
+
+// RUN: %clang_analyze_cc1 %s -verify \
+// RUN:   -analyzer-checker=core,debug.ExprInspection
----------------
I think you can just append the C file to this one.
This shouldn't make a difference. And if it does, you can still guard that test code using macro `ifndef` trickery in addition to distinct `-verify=xxx` prefixes.


================
Comment at: clang/test/Analysis/issue-55019.cpp:24-26
+  // FIXME: As we cannot know whether the copy overflows, we will invalidate the
+  // entire object. Modify the verify direction from SymRegion to HeapSymRegion
+  // when the size if modeled in CStringChecker.
----------------
So, you say that `It should be HeapSymRegion instead` because `x.arr` shouldn't be invalidated because the std::copy` won't modify it. Right?


================
Comment at: clang/test/Analysis/pr22954.c:560
   clang_analyzer_eval(x263.s1[0] == 0); // expected-warning{{UNKNOWN}}
+  // expected-warning at -1{{Potential leak of memory pointed to by 'x263.s2'}}
   clang_analyzer_eval(x263.s1[1] == 0); // expected-warning{{UNKNOWN}}
----------------
Ah, the leak should have been reported at the end of the full-expression of `memcpy`.
If we have this expect line here it could mislead the reader that it has anything to do with the `eval` call before.

In fact, any statement after `memcpy` would have this diagnostic.
I have seen cases where we have a `next_line()` opaque call after such places to anchor the leak diagnostic.
I think we can do the same here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152435/new/

https://reviews.llvm.org/D152435



More information about the cfe-commits mailing list