[PATCH] D69813: [analyzer] CERTStrChecker: Model gets()

Tue Nov 5 09:42:47 PST 2019

Szelethus added a comment.

Hmm, so this checker is rather a collection of CERT rule checkers, right? Shouldn't the checker name contain the actual rule name (STR31-C)? User interfacewise, I would much prefer smaller, leaner checkers than a big one with a lot of options, which are barely supported for end-users. I would expect a `cert` package to contain subpackages like `cert.str`, and checker names `cert.str.31StringSize`, or similar. Also, shouldn't we move related checkers from `security.insecureAPI` to `cert`? Or just mention the rule name in the description, and continue not having a  `cert` package?

================
Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:70
 def Security : Package <"security">;
+def CERT : Package<"cert">, ParentPackage<Security>;
 def InsecureAPI : Package<"insecureAPI">, ParentPackage<Security>;
----------------
Hmm. We have a variety of checkers that check for a CERT rule. Maybe we should put the rest here as well, if would better follow the clang-tidy interface.

I'll ask around in the office.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/AllocationState.h:29
+/// \returns The MallocBugVisitor.
+std::unique_ptr<BugReporterVisitor> getMallocBRVisitor(SymbolRef Sym);
+
----------------
I would prefer if this header file didn't exist, or was thought out better, because its messy that we hide `MallocChecker`, but expose its guts like this.

The change is fine, this is just a critique of the checker.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp:184
+  if (IsFix) {
+    if (Optional<std::string> SizeStr = getSizeExprAsString(Call, CallC, C)) {
+      renameFunctionFix(UseSafeFunctions ? "gets_s" : "fgets", Call, *Report);
----------------
Charusso wrote:
> NoQ wrote:
> > Charusso wrote:
> > > NoQ wrote:
> > > > Also, which is probably more important, you will never be able to provide a fixit for the malloced memory case, because there may be multiple execution paths that reach the current point with different size expressions (in fact, not necessarily all of them are malloced).
> > > > 
> > > > Eg.:
> > > > ```lang=c
> > > > char *x = 0;
> > > > char y[10];
> > > > 
> > > > if (coin()) {
> > > >   x = malloc(20);
> > > > } else {
> > > >   x = y;
> > > > }
> > > > 
> > > > gets(x);
> > > > ```
> > > > 
> > > > If you suggest replacing `gets(x)` with `gets_s(x, 20)`, you'll still have a buffer overflow on the else-branch on which `x` points to an array of 10 bytes.
> > > This checker going to evolve a lot, and all of the checked function calls have issues like that. I do not even think what else issues they have. I would like to cover the false alarm suppression when we are about to alarm. Is it would be okay? I really would like to see alarms first.
> > > 
> > > For example, I have seen stuff in the wild so that I can state out 8-param allocators and we need to rely on the checkers provide information about allocation.
> > *summons @Szelethus*
> > 
> > Apart from the obviously syntactic cases, you might actually be able to implement fixits for the situation when the reaching-definitions analysis displays exactly one definition for `x`, which additionally coincides with the allocation site. If that definition is a simple assignment, you'll be able to re-run the reaching definitions analysis for the RHS of that assignment. If that definition comes from a function call, you might be able to re-run the reaching definitions analysis on the return statement(s) of that function (note that this function must have been inlined during path-sensitive analysis, otherwise no definition in it would coincide with the allocation site). And so on.
> > 
> > This problem sheds some light on how much do we want to make the reaching definitions analysis inter-procedural. My current guess is that we probably don't need to; we'd rather have this guided by re-running the reaching-definitions analysis based on the path-sensitive report data, than have the reaching-definitions analysis be inter-procedural on our own.
> That is a cool idea! I hope @Szelethus has time for his project.
This sounds very cool! Once we're at the bug report construction phase, we can make reaching definitions analysis "interprocedural enough" for most cases, I believe.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69813/new/

https://reviews.llvm.org/D69813