[PATCH] D70411: [analyzer] CERT STR rule checkers: STR31-C

Balogh, Ádám via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jun 29 08:03:43 PDT 2020


baloghadamsoftware requested changes to this revision.
baloghadamsoftware added inline comments.
This revision now requires changes to proceed.
Herald added a subscriber: rnkovacs.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp:55
+  // they can cause a not null-terminated string. In this case we store the
+  // string is being not null-terminated in the 'NullTerminationMap'.
+  //
----------------
Charusso wrote:
> balazske wrote:
> > The functions `gets`, `strcpy` return a null-terminated string according to standard, probably `sprintf` too, and the `fscanf` `%s` output is null-terminated too even if no length is specified (maybe it writes outside of the specified buffer but null terminated).
> From where do you observe such information? I have not seen anything `strcpy()`-specific in the standard. My observation clearly says that, if there is not enough space for the data you *could* even write to overlapping memory, or as people says, you could meet nasal demons. It is called *undefined* behavior.
>From `cppreference.com`:

`strcpy()` -- //Copies the null-terminated byte string pointed to by src, including the null terminator, to the character array whose first element is pointed to by dest.//

`gets()` -- //https://en.cppreference.com/w/c/io/gets//

`scanf`, `fscanf`, `sscan` format `%s` -- //If width specifier is used, matches up to width or until the first whitespace character, whichever appears first. Always stores a null character in addition to the characters matched (so the argument array must have room for at least width+1 characters)//

Thus @balazske is completely right. All these functions add the null-terminator. Of course, if the string does not fit, it overwrites the memory after the buffer which is undefined behavior. But the null-terminator is definitely added.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp:100
+/// Returns the interesting region of \p V.
+static const MemRegion *getRegion(SVal V) {
+  const MemRegion *MR = V.getAsRegion();
----------------
Do we really need a separate utility function for that?


================
Comment at: clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp:136
+  llvm::raw_svector_ostream Out(Msg);
+  Out << '\'' << CallName << "' could write outside of '" << printPretty(Arg, C)
+      << '\'';
----------------
Create `SmallString`, use a stream to print into it, create another one in `printPretty`, convert it to `std::string` and then copy it again to the original `SmallString`? Why?


================
Comment at: clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp:165
+  if (EnableStr31cChecker)
+    createCallOverflowReport(Call, CallC, C);
+}
----------------
`sprintf` not non-compliant generally. You should calculate the possible maximal length of the string and compare it to the size of the target region.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp:180
+  // FIXME: Handle multiple buffers.
+  if (FormatExpr->getString() != "%s")
+    return;
----------------
Checking for `%s` is not enough. `%2s` may also be a problem if the buffer is `1` character long.


================
Comment at: clang/test/Analysis/cert/str31-c.cpp:27
+namespace test_gets_good {
+enum { BUFFERSIZE = 32 };
+
----------------
Why do we need a different buffer size here? And why `enum`?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70411/new/

https://reviews.llvm.org/D70411





More information about the cfe-commits mailing list