[PATCH] D30909: [Analyzer] Finish taint propagation to derived symbols of tainted regions

Vlad Tsyrklevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu May 11 23:39:39 PDT 2017


vlad.tsyrklevich added inline comments.


================
Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:656-659
+  // If the SVal is a LazyCompoundVal it might only cover sub-region of a given
+  // symbol. For example, the LCV might represent a field in an uninitialized
+  // struct. In this case, the LCV encapsulates a conjured symbol and reference
+  // to the sub-region representing the struct field.
----------------
NoQ wrote:
> I still feel bad about producing API with very tricky pre-conditions. The LCV may have various forms - some may have empty store with no symbols at all, and others may be full of direct bindings that make the symbol essentially irrelevant. However, because the taint API is designed to be defensive to cases when taint cannot be added, and it sounds like a good thing, i guess we've taken the right approach here :)
> 
> I suggest commenting this more thoroughly though, something like:
> 
> > If the SVal represents a structure, try to mass-taint all values within the structure. For now it only works efficiently on lazy compound values that were conjured during a conservative evaluation of a function - either as return values of functions that return structures or arrays by value, or as values of structures or arrays passed into the function by reference, directly or through pointer aliasing. Such lazy compound values are characterized by having exactly one binding in their captured store within their parent region, which is a conjured symbol default-bound to the base region of the parent region.
> 
> Then inside `if (Sym)`:
> 
> > If the parent region is a base region, we add taint to the whole conjured symbol.
> 
> > Otherwise, when the value represents a record-typed field within the conjured structure, so we add partial taint for that symbol to that field.
The pre-conditions for using the API are actually a bit simpler than what's exposed here. I made it explicit to make the logic for tainting LCVs explicit, but the following simpler logic works:
```
  if (auto LCV = V.getAs<nonloc::LazyCompoundVal>()) {
    if (Optional<SVal> binding = getStateManager().StoreMgr->getDefaultBinding(*LCV)) {
      if (SymbolRef Sym = binding->getAsSymbol()) {
        return addPartialTaint(Sym, LCV->getRegion(), Kind);
      }
    }
  }
```

This works because `addPartialTaint()` actually already performs the 'getRegion() == getRegion()->getBaseRegion()` check already and taints the parent symbol if the region is over the base region already. I chose to replicate it here to make the logic more explicit, but now that I've written this down the overhead of duplicating the logic seems unnecessary. Do you agree?


================
Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:701-703
+  // Ignore partial taint if the entire parent symbol is already tainted.
+  if (contains<TaintMap>(ParentSym))
+    return this;
----------------
NoQ wrote:
> Speaking of taint tags: right now we didn't add support for multiple taint tags per symbol (because we have only one taint tag to choose from), but `addTaint` overwrites the tag. I guess we should do the same for now.
I believe this is the current behavior. On line 714 I presume ImmutableMap::add overrides a key if it's already present in the map but I couldn't trace down the Tree ADT implementation to confirm this.


https://reviews.llvm.org/D30909





More information about the cfe-commits mailing list