[PATCH] D101526: [analyzer][StdLibraryFunctionsChecker] Add NoteTags for applied arg constraints

Gabor Marton via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Sep 15 03:09:19 PDT 2022


martong marked 7 inline comments as done.
martong added inline comments.
Herald added a project: All.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp:156-157
     // the bug is reported.
-    virtual std::string describe(ProgramStateRef State,
+    virtual std::string describe(DescritptionKind DK, ProgramStateRef State,
                                  const Summary &Summary) const {
       // There are some descendant classes that are not used as argument
----------------
Szelethus wrote:
> How about we turn this into a print-like function and instead of returning with a string, we take an `llvm::raw_ostream` object as argument? `SmallString` + `raw_svector_stream` is how we construct most of our checker message strings.
I don't see how that change would be relevant. Would we have a better run-time, or code that is easier to understand? please elaborate.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp:854
+          NewState, NewNode,
+          C.getNoteTag([Msg, ArgSVal](PathSensitiveBugReport &BR,
+                                      llvm::raw_ostream &OS) {
----------------
steakhal wrote:
> 
Thanks, changed it.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp:857
+            if (BR.isInteresting(ArgSVal))
+              OS << Msg;
+          }));
----------------
NoQ wrote:
> Szelethus wrote:
> > steakhal wrote:
> > > Ah, there is a slight issue.
> > > You should mark some stuff interesting here, to make this interestingness propagate back transitively.
> > > 
> > > Let's say `ArgSVal` is `x + y` which is considered to be out of range `[42,52]`. We should mark both `x` and `y` interesting because they themselves could have been constrained by the StdLibChecker previously. So, they must be interesting as well.
> > > 
> > > On the same token, IMO `PathSensitiveBugReport::markInteresting(symbol)` should be //transitive//. So that all `SymbolData` in that symbolic expression tree are considered interesting. What do you think @NoQ?
> > > If we were doing this, @martong  - you could simply acquire the assumption you constructed for the given `ValueConstraint`, and mark that interesting. Then all `SymbolData`s on both sides of the logic operator would become implicitly interesting.
> > >On the same token, IMO PathSensitiveBugReport::markInteresting(symbol) should be transitive. So that all SymbolData in that symbolic expression tree are considered interesting. What do you think @NoQ?
> > 
> > Thats how I'd expect this to work. This shouldn't be a burden on the checker developer (certainly not this kind of a checker), but rather be handled by `PathSensitiveBugReport`.
> > 
> > So I think this is fine as it is.
> Interestingness isn't a thing on its own; its meaning is entirely tied to the nature of the specific bug report. An interesting pointer in the null dereference checker is absolutely not the same thing as an interesting mutex pointer in `PthreadLockChecker` or a (de)allocated pointer in `MallocChecker`.
> 
> I currently treat interestingness as a "GDM for visitors". It's their own way of communicating with themselves and with each other, a state they keep track of and update as they visit the bug report (initially populated during construction of the bug report). But the meaning of this state is entirely specific to the visitors. It is visitors who give interestingness a meaning (and the visitors are, naturally, also hand-picked during construction of the bug report).
> 
> So I think the right question to ask is "what do you want interestingness to mean in your checker?" and build your visitors accordingly.
> 
> Your visitors should provide enough information for the user to be able to understand the bug report. When the report says "$x + $y is in range [42, 52] and no values in that range are a valid input to that function", the user asks "why do you think $x + $y is in range [42, 52]?" and we'll have to answer that.
> 
> For example, in
> ```lang=c++
> if (x + y >= 42 && x + y <= 52)
>   foo(x + y);
> ```
> there's no need to track ranges for $x and $y separately; it is sufficient to point the user to the constraint over $x + $y obtained from the if-statement. On the other hand, in
> ```lang=c++
> if (x >= 44 && x <= 50)
>   if (y >= -2 && y <= 2)
>     foo(x + y);
> ```
> you'll have to explain both $x and $y in order for the user to understand that $x + $y is indeed in range [42, 52].
> 
> There are also other funny edge cases depending on the nature of the arithmetic, such as
> ```
> int z = x * y;
> if (x == 0)
>   return 1 / z;
> ```
> where in order to explain the division by the zero value $x * $y it is sufficient to explain $x which makes $y redundant. And if they both are zero then we should probably flip a coin?
> 
> So I think this is a non-trivial problem. Even on the examples above (let alone other bug types!) it's easy to see that interestingness of $x and $y doesn't always follow from the interestingness of $x + $y (but sometimes it indeed does). I think the answer lies somewhere in the underneathies of the constraint solver: we have to follow its logic in order to find out how the range was inferred (which it probably should annotate with more note tags so that we didn't have to reverse-engineer it in the visitor?) (@vsavchenko you may find this discussion particularly peculiar).
Guys, the thing is, we should have our discussion about "transitive interestingness" somewhere else.

This patch is orthogonal to that problem. Actually, in this patch I add extra notes to already interesting SVals. Those SVals are marked to be interesting by other checkers. As @NoQ describes, I agree, that a checker could be responsible to describe how it wants to handle transitivity. But, once again, it is not the StdLibraryFunctionChecker that is marking the SVals interesting. Please take a look at the newly added test file: We have a division by zero there, thus the divisor is marked interesting by the DivZeroChecker (and transitivity is handled by trackExpressionValue). What we do in this patch is if we know that a value to be interesting and we know that had been constrained by an argument constraint, then we attach a note that describes this fact.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101526/new/

https://reviews.llvm.org/D101526



More information about the cfe-commits mailing list