[PATCH] D30909: [Analyzer] Finish taint propagation to derived symbols of tainted regions

Artem Dergachev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 10 02:57:46 PDT 2017

NoQ added a comment.

This is fantastic, thanks! I really like the shape of how it turned out to work.

Minor stuff and we're landing :)

Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:651
+ProgramStateRef ProgramState::addTaint(SVal V,
+                                           TaintTagType Kind) const {
+  SymbolRef Sym = V.getAsSymbol();
Whitespace seems a bit off.

Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:656-659
+  // If the SVal is a LazyCompoundVal it might only cover sub-region of a given
+  // symbol. For example, the LCV might represent a field in an uninitialized
+  // struct. In this case, the LCV encapsulates a conjured symbol and reference
+  // to the sub-region representing the struct field.
I still feel bad about producing API with very tricky pre-conditions. The LCV may have various forms - some may have empty store with no symbols at all, and others may be full of direct bindings that make the symbol essentially irrelevant. However, because the taint API is designed to be defensive to cases when taint cannot be added, and it sounds like a good thing, i guess we've taken the right approach here :)

I suggest commenting this more thoroughly though, something like:

> If the SVal represents a structure, try to mass-taint all values within the structure. For now it only works efficiently on lazy compound values that were conjured during a conservative evaluation of a function - either as return values of functions that return structures or arrays by value, or as values of structures or arrays passed into the function by reference, directly or through pointer aliasing. Such lazy compound values are characterized by having exactly one binding in their captured store within their parent region, which is a conjured symbol default-bound to the base region of the parent region.

Then inside `if (Sym)`:

> If the parent region is a base region, we add taint to the whole conjured symbol.

> Otherwise, when the value represents a record-typed field within the conjured structure, so we add partial taint for that symbol to that field.

Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:701-703
+  // Ignore partial taint if the entire parent symbol is already tainted.
+  if (contains<TaintMap>(ParentSym))
+    return this;
Speaking of taint tags: right now we didn't add support for multiple taint tags per symbol (because we have only one taint tag to choose from), but `addTaint` overwrites the tag. I guess we should do the same for now.

Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:715
+  TaintedSubRegions = TaintedSubRegions.add(SubRegion, Kind);
+  return set<DerivedSymTaint>(ParentSym, TaintedSubRegions);
Can we assert that the returned state is not empty, like in `addTaint`?


More information about the cfe-commits mailing list