[PATCH] D144269: [Analyzer] Show "taint originated here" note of alpha.security.taint.TaintPropagation checker at the correct place

Daniel Krupp via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Apr 14 05:58:28 PDT 2023


dkrupp added a comment.

All remarks from @steakhal has been fixed. Thanks for the review.
This new version now can handle the tracking back of multiple symbols!



================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:129-130
 /// Given a pointer/reference argument, return the value it refers to.
-std::optional<SVal> getPointeeOf(const CheckerContext &C, SVal Arg) {
+std::optional<SVal> getPointeeOf(ASTContext &ASTCtx,
+                                 const ProgramStateRef State, SVal Arg) {
   if (auto LValue = Arg.getAs<Loc>())
----------------
steakhal wrote:
> BTW I don't know but `State->getStateManager().getContext()` can give you an `ASTContext`. And we tend to not put `const` to variable declarations. See [[ https://releases.llvm.org/4.0.0/tools/clang/tools/extra/docs/clang-tidy/checks/readability-avoid-const-params-in-decls.html | readability-avoid-const-params-in-decls ]]
> 
> In other places we tend to refer to `ASTContext` by the `ACtx` I think.
> We also prefer const refs over mutable refs. Is the mutable ref justified for this case?
Thanks for the suggestion. I took out ASTContext from the signature.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:209-214
+          if (nofTaintedArgs == 0)
+            Out << "Taint propagated to argument "
+                << TaintedArgs.at(Sym.index()) + 1;
+          else
+            Out << ", " << TaintedArgs.at(Sym.index()) + 1;
+          nofTaintedArgs++;
----------------
steakhal wrote:
> I'd recommend using `llvm::interleaveComma()` in such cases.
> You can probably get rid of `nofTaintedArgs` as well - by using this function.
I chose another solution. I hope that is ok too.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:213
+          else
+            Out << ", " << TaintedArgs.at(Sym.index()) + 1;
+          nofTaintedArgs++;
----------------
steakhal wrote:
> I believe this branch is uncovered by tests.
Now it is covered. See multipleTaintedSArgs(..) test in taint-diagnostic-visitor.c


================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:221
+    }
+    return std::string(Out.str());
+  });
----------------
steakhal wrote:
> I think since you explicitly specify the return type of the lambda, you could omit the spelling of `std::string` here.
not sure. Got a "cannot convert raw_svector_ostream::str() from llvm:StringRef" error.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:875-878
+      // FIXME: The call argument may be a complex expression
+      // referring to multiple tainted variables.
+      // Now we generate notes and track back only one of them.
+      SymbolRef TaintedSym = isTainted(State, *V);
----------------
steakhal wrote:
> You could iterate over the symbol dependencies of the SymExpr (of the `*V` SVal).
> 
> ```lang=c++
> SymbolRef PointeeAsSym = V->getAsSymbol();
> // eee, can it be null? Sure it can. See isTainted(Region),... for those cases we would need to descend and check their symbol dependencies.
> for (SymbolRef SubSym : llvm::make_range(PointeeAsSym->symbol_begin(), PointeeAsSym->symbol_end())) {
>   // TODO: check each if it's also tainted, and update the `TaintedSymbols` accordingly, IDK.
> }
> ```
> Something like this should work for most cases (except when `*V` refers to a tainted region instead of a symbol), I think.
I implememented a new function getTaintedSymbols(..) in Taint.cpp which returns all tainted symbols for a complex expr, SVal etc. With this addition, now we can track back multiple tainted symbols reaching a sink.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144269/new/

https://reviews.llvm.org/D144269



More information about the cfe-commits mailing list