[PATCH] D28445: [Analyzer] Extend taint propagation and checking

Artem Dergachev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 1 01:01:40 PST 2017


NoQ added a subscriber: a.sidorin.
NoQ added a comment.

Hello! Sorry, i'm very distracted recently.

I think your new approach should work, however it would be much easier and more straightforward to just ask the store manager to provide the default-bound symbol for a region directly, instead of trying to derive from it manually and then underive it back. In inline comments i also show that sometimes we're not really that much interested in the particular derived symbols.



================
Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:442
+
+    const RecordDecl *RD = RT->getDecl()->getDefinition();
+    for (const auto *I : RD->fields()) {
----------------
We need to be careful in the case when we don't have the definition in the current translation unit. In this case we may still have derived symbols by casting the pointer into some blindly guessed type, which may be primitive or having well-defined primitive fields.

By the way, in D26837 i'm suspecting that there are other errors of this kind in the checker, eg. when a function returns a void pointer, we put taint on symbols of type "void", which is weird.

Adding Alexey who may recall something on this topic.


================
Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:455
+  } else if (T->isArrayType()) {
+    const ArrayType *AT = T->getAsArrayTypeUnsafe();
+    const ElementRegion *ER =
----------------
The "safe" way to do this is to use one of the `ASTContext`'s `get***ArrayType()` methods.


================
Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:667
 
+  // If this a derived symbol, taint the parent symbol.
+  if (const SymbolDerived *SD = dyn_cast<SymbolDerived>(Sym))
----------------
While this is quite vague, i can imagine us wanting to taint only a particular field of a symbolic structure. Considering that taint also automatically propagates in the opposite direction (from parent symbol to derived symbol), i'd rather preserve some freedom by moving this code to the taint-checker (and other places where we may want to generate taint; for now, it's `getLCVSymbol()`), forcing them to manually lookup the parent symbol and put taint there when they want to.


================
Comment at: test/Analysis/taint-tester.c:53
   scanf("%d", &xy.x);
   int tx = xy.x; // expected-warning + {{tainted}}
+  int ty = xy.y; // expected-warning + {{tainted}}
----------------
I don't think you'd be able to pass this test by tweaking the taint mechanisms alone. The original FIXME here appears because the second `scanf()` destroys the first `scanf()`'s binding, together with the default-bound symbol. There's no way you preserve taint on `y` without adding taint on `z`, unless you model `scanf()` to update only specific store bindings.

In any case, this makes a good example for my comment in `addTaint()`. Because false positives are worse than false negatives, we shouldn't add taint to the default-bound symbol here.


https://reviews.llvm.org/D28445





More information about the cfe-commits mailing list