[PATCH] D59516: [analyzer] Add custom filter functions for GenericTaintChecker

Artem Dergachev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Nov 7 16:55:54 PST 2019


NoQ accepted this revision.
NoQ added a comment.

@NoQ: "Why not simply remove taint?"
@boga95: //*removes taint*//
@NoQ: "Hmm, now that i think about it, adding a 'no taint' marker might actually work correctly more often."

Like, if you have taint on `$x` and try to remove taint from `derived<$x, x.a>`, your current implementation will do nothing. But the approach with adding a 'no taint' marker will actually add a new marker and make subsequent lookups to `derived<$x, x.a>` will return the newly added marker, which is the correct behavior; additionally, `derived<$x, x.b>` would remain tainted (where `b != a`), which is also the correct behavior. It would have still failed when you describe the sub-region slightly differently, but that'd be a fairly minor glitch.

The right way to proceed further with the `removeTaint()` approach on `SymbolDerived` is to introduce `removePartialTaint()` that would complement `addPartialTaint()`. But that will require changing the data structure in the program state from a simple set of tainted sub-regions to a sophisticated tree of sub-regions that are marked up as tainted or not tainted. Which might have as well been a marker.

I still tend to believe that `removeTaint()` is the right approach, but it's a bit harder to get right and a bit worse if not enough effort is invested into it.

There definitely is, however, a good use for the `removeTaint()` function in both approaches: namely, our taint checker still lacks `checkDeadSymbols` :D



================
Comment at: clang/lib/StaticAnalyzer/Checkers/Taint.cpp:111-112
+ProgramStateRef taint::removeTaint(ProgramStateRef State, SymbolRef Sym) {
+  // If this is a symbol cast, remove the cast before adding the taint. Taint
+  // is cast agnostic.
+  while (const SymbolCast *SC = dyn_cast<SymbolCast>(Sym))
----------------
That's not entirely true. Like, if you check `(char)x`, you still have 24 bytes of `x` controlled by the attacker. But that's a good false-positive-proof approach.


================
Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:474-476
+  const FunctionDecl *FDecl = C.getCalleeDecl(CE);
+  if (!FDecl || FDecl->getKind() != Decl::Function)
+    return;
----------------
boga95 wrote:
> Szelethus wrote:
> > When do these happen? For implicit functions in C? 
> For example, a lambda doesn't have an FunctionDecl.
Lambda most certainly has a `FunctionDecl`, which is the declaration of its `operator()()`, and that's exactly what's going to be in `FDecl` if a lambda is invoked. However, the `getKind()` of this `FunctionDecl` will not be `Decl::Function` but `Decl::CXXMethod`, like of any other member function.

So this second check is checking that the function is a simple global function.

I recommend replacing with `!isa<CXXMethodDecl>(FDecl)`, purely for readability, or at least adding a comment.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59516/new/

https://reviews.llvm.org/D59516





More information about the cfe-commits mailing list