[PATCH] D69813: [analyzer][WIP] CERTStrChecker: Model gets()

Mon Nov 4 14:44:50 PST 2019

NoQ added inline comments.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp:184
+  if (IsFix) {
+    if (Optional<std::string> SizeStr = getSizeExprAsString(Call, CallC, C)) {
+      renameFunctionFix(UseSafeFunctions ? "gets_s" : "fgets", Call, *Report);
----------------
Charusso wrote:
> NoQ wrote:
> > Also, which is probably more important, you will never be able to provide a fixit for the malloced memory case, because there may be multiple execution paths that reach the current point with different size expressions (in fact, not necessarily all of them are malloced).
> > 
> > Eg.:
> > ```lang=c
> > char *x = 0;
> > char y[10];
> > 
> > if (coin()) {
> >   x = malloc(20);
> > } else {
> >   x = y;
> > }
> > 
> > gets(x);
> > ```
> > 
> > If you suggest replacing `gets(x)` with `gets_s(x, 20)`, you'll still have a buffer overflow on the else-branch on which `x` points to an array of 10 bytes.
> This checker going to evolve a lot, and all of the checked function calls have issues like that. I do not even think what else issues they have. I would like to cover the false alarm suppression when we are about to alarm. Is it would be okay? I really would like to see alarms first.
> 
> For example, I have seen stuff in the wild so that I can state out 8-param allocators and we need to rely on the checkers provide information about allocation.
*summons @Szelethus*

Apart from the obviously syntactic cases, you might actually be able to implement fixits for the situation when the reaching-definitions analysis displays exactly one definition for `x`, which additionally coincides with the allocation site. If that definition is a simple assignment, you'll be able to re-run the reaching definitions analysis for the RHS of that assignment. If that definition comes from a function call, you might be able to re-run the reaching definitions analysis on the return statement(s) of that function (note that this function must have been inlined during path-sensitive analysis, otherwise no definition in it would coincide with the allocation site). And so on.

This problem sheds some light on how much do we want to make the reaching definitions analysis inter-procedural. My current guess is that we probably don't need to; we'd rather have this guided by re-running the reaching-definitions analysis based on the path-sensitive report data, than have the reaching-definitions analysis be inter-procedural on our own.

Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69813/new/

https://reviews.llvm.org/D69813