[PATCH] D70411: [analyzer] CERT: STR31-C

Wed Mar 25 12:59:38 PDT 2020

Charusso marked 6 inline comments as done.
Charusso added a comment.

"To prevent such errors, either limit copies through truncation or, preferably, ensure that the destination is of sufficient size to hold the character data" - from the rule's page.
Most of the projects are fine truncating by hand because the write happens in somewhat well-bounded strings: IP-addresses, names, numbers... I wanted to make this as practical as possible. Until you are having a null-terminated string without being read, you are most likely fine. Feel free to try this out, probably you would already understand the `WarnOnCall` option very well.

================
Comment at: clang/docs/analyzer/checkers.rst:1935
+
+alpha.security.cert.str.31c
+"""""""""""""""""""""""""""
----------------
balazske wrote:
> Charusso wrote:
> > Szelethus wrote:
> > > balazske wrote:
> > > > There are already more checkers that can check for CERT related problems but not specially made for these. These checkers do not reside in this new `cert` group. And generally a checker does not check for specifically a CERT rule, instead for more of them or other things too, or more checkers can detect a single rule. (And the user can think that only these CERT rules are checkable that exist in this package, that is not true.) So I do not like the introduction of this new `cert` package. (The documentation of existing checkers lists if the checker is designed for a CERT rule.)
> > > I disagree to some extent. I think it would be great to have a `cert` package that houses all checkers for each of the rules with the addition of checker aliases. Clang-tidy has something similar as well!
> > I designed the checker only for the rule STR31-C that is why I have picked package `cert`. Clang-Tidy could aliasing checks. For example the check could be in the `bugprone` category aliased to `cert` and both could trigger the analysis.
> > 
> > > the user can think that only these CERT rules are checkable
> > We only need to move `security.FloatLoopCounter` under the `cert` package. What else SEI CERT rules are finished off other than package `cert`? It is not my fault if someone could not package well or could not catch most of the issues of a SEI CERT rule or could not reach the SEI CERT group to note the fact the Analyzer catch a very tiny part of a rule. However, this patch package well and could catch most of the STR31-C rule.
> So I can move this checker: D71510 into `alpha.security.cert.err.33c`?
Well, not exactly. No one should care about alpha checkers except advanced Static Analyzer developers like you do right now. Since alpha is a dead-zone you can put your work anywhere. The core issue was Ted's placing of CERT rules.

I have written the first CERT-checker after Ted's, which introduced such packaging, then it became dead, so I have lost interest to make the package-aliasing a thing. If you wish to move to `cert`, feel free, but please note that, your first idea of bad-packaging is right. We really need the functionality of package-aliasing but I have ran out of time and the project died as well.

================
Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:803
+
+} // end "cert.str"
+
----------------
balazske wrote:
> Charusso wrote:
> > balazske wrote:
> > > NoQ wrote:
> > > > `alpha.cert.str`.
> > > This text may be hard to understand. The option means "if it is off, the problem report happens only if there is no hand-written null termination of the string"? I looked at the CERT rule but did not find out why is null termination by hand important here. (I did not look at the code to find out what it does.) And "WarnOnCall" suggests that there is warning made on calls if on, or not at all or at other location if off, is this correct?
> > You let the buffer overflow by a bugprone function call, then you adjust the null-terminator by simply `buf[size] = '\0'`, then you made sure the buffer cannot overflow, since you have terminated it. It is a very common idiom therefore I do not think a hand-written null-termination is a security issue. The SEI CERT rules are all theoretical so you will not find anything useful in practice. My solution is practical.
> > 
> > > This text may be hard to understand.
> > Please note that this text only made for Static Analyzer developers. Let us rephrase it then:
> > 
> > > Whether the checker needs to warn on the bugprone function calls immediately or look for bugprone hand-written null-termination of bugprone function call made strings. It is a common idiom to null-terminate the string by hand after the insecure function call produce the string which could be misused so that it is on by default. It is useful to turn it off to reduce the noise of the checker, because people usually null-terminate the string by hand immediately after the bugprone function call produce the string.
> > 
> > Do you buy that?
> First sentence is OK, the later part relatively better. Probably the explanations about why it is good to turn it on or off can be left out, these are misunderstandable. (If the manual null-termination is done, in one case it is a "common idiom but can be misused so make warning" on other case "these warnings generates only extra noise".)
> It can be something like this (if correct):
> ```
> Whether the checker wars on the function calls immediately or warns on bugprone hand-written null-termination of function call made strings. Default value is on because null-terminating the string by hand after the insecure function call could be misused. But it is a common idiom to manually null-terminate the strings immediately after the function calls so it could be turned off to reduce noise of the checker.
> ```
> I do not like the word "bugprone", is it correct english? ("Problematic" or "insecure" could be used instead.)
`bugprone` comes from the Clang-Tidy world and we use it everywhere, but let us change it to `insecure` then. I will copy-paste it. Thanks!

================
Comment at: clang/test/Analysis/cert/str31-c-notes-warn-on-call-off.cpp:68
+
+  do_something(dest);
+  // expected-warning at -1 {{'dest' is not null-terminated}}
----------------
balazske wrote:
> Charusso wrote:
> > balazske wrote:
> > > Maybe I have wrong expectations about what this checker does but did understand it correctly yet. The main problem here is that `dest` may be not large enough (if size of `src` is not known). This would be a better text for the error message? And if this happens (the write outside) it is already a fatal error, can cause crash. If no buffer overflow happens the terminating zero is copied from `src`, so why would `dst` be not null terminated?
> > I like that error message because that is the truth, that is the issue. If it would crash so the operating system would detect a fatal error we should not develop Static Analyzer and vulnerabilities would not exist. That would be cool, btw.
> There must be some misunderstanding here, I still do not know what this test does. Can probably somebody other explain it? After the first `strcpy` the `src` content is copied into `dest`. If `src` is not a null-terminated string this is itself a bug and `strcpy` is undefined behavior. If `src` has `strlen(src) > 127` it is a bug again because the copy can write outside of `dest` past its end, into the stack frame of the function. After this happens, writing a null terminator does not correct the other overwritten data that still can cause crash or other thing. (This `strcpy` call is the place for a warning. Probably not a fatal error because it is not sure, if size of `src` is not known.) Otherwise result of `strcpy` is a null-terminated string in `dest`. The next `*dest = '\0'` causes that `dest` points to an empty string, why is this needed here (it was already null terminated)? The following `strcpy` overwrites it anyway.
Well, I did not make any comments about what is going on, because it is far away of being finished, it is an alpha checker. Until we have not designed it, it is useless to write documentation, but feel free to read the discussion of the checker's evolution, so that I do not need to repeat myself. Once we have designed it, sure, I will document the design.

Let us first clear your misconceptions:

> After the first strcpy the src content is copied into dest. If src is not a null-terminated string this is itself a bug and strcpy is undefined behavior.
Cool, but this is another rule namely STR32-C: D71033. Here we do not care about that issue. Each test should be isolated and test the behavior what we expect. We do not care about another patch/rule here.

> If src has strlen(src) > 127 it is a bug again because the copy can write outside of dest past its end, into the stack frame of the function.
It is a bug in a security point of view. However, in the wild, no one really cares about it, and null-terminate the string by hand. That is why this is the test file when `WarnOnCall=false` to reduce the noise of the checker if someone is fine with such concept, but still want to catch non-null-terminated strings. So that this checker servers the needs of "enough space for holding the string" and "null-terminated strings" given by the assumption `src` is already null-terminated or `dest` is null-terminated by hand before the string is possibly read. Most of the C projects have a constant global array which could hold 1024 characters and they do not care if overflow happens, but still care if null-termination happens so truncate the string by hand.

> After this happens, writing a null terminator does not correct the other overwritten data that still can cause crash or other thing.
Tell to the C open source project (like VIM) writers they are dumb. I still want to make sure they could benefit from this checker in a non-security point of view.

> (This strcpy call is the place for a warning. Probably not a fatal error because it is not sure, if size of src is not known.) 
Nope, D71033.

> Otherwise result of strcpy is a null-terminated string in dest.
Nope, in a non-security point of view there is no issue, because the normal behavior of the program to read an IP-address or something bounded and just a truncation needed for the enter/space characters to the end. This is a normal C-programmer idiom and on projects where no-one cares about security, like VIM.

> The next *dest = '\0' causes that dest points to an empty string,
Yes.

> why is this needed here (it was already null terminated)? The following strcpy overwrites it anyway.
`*dest = '\0'` demonstrates the usual C-programmer behavior. They somehow null-terminate the string then everything is alright, no buffer-overflow read could happen. However, on the second time the null-termination may not happen since the `dest` buffer moves out of scope and possibly read by `do_something()`. If it is read in a non-null-terminated way, that is a problem in their world.

Probably you get it now why such option introduced. Feel free to try this out. The thing you will see, we cannot point out interesting properties of the bug, such as the destination buffer, so that we cannot give great bug reports, so that the checker remain alpha and useless.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70411/new/

https://reviews.llvm.org/D70411