[PATCH] D144269: [Analyzer] Show "taint originated here" note of alpha.security.taint.TaintPropagation checker at the correct place
Daniel Krupp via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Apr 14 05:58:28 PDT 2023
dkrupp added a comment.
All remarks from @steakhal has been fixed. Thanks for the review.
This new version now can handle the tracking back of multiple symbols!
================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:129-130
/// Given a pointer/reference argument, return the value it refers to.
-std::optional<SVal> getPointeeOf(const CheckerContext &C, SVal Arg) {
+std::optional<SVal> getPointeeOf(ASTContext &ASTCtx,
+ const ProgramStateRef State, SVal Arg) {
if (auto LValue = Arg.getAs<Loc>())
----------------
steakhal wrote:
> BTW I don't know but `State->getStateManager().getContext()` can give you an `ASTContext`. And we tend to not put `const` to variable declarations. See [[ https://releases.llvm.org/4.0.0/tools/clang/tools/extra/docs/clang-tidy/checks/readability-avoid-const-params-in-decls.html | readability-avoid-const-params-in-decls ]]
>
> In other places we tend to refer to `ASTContext` by the `ACtx` I think.
> We also prefer const refs over mutable refs. Is the mutable ref justified for this case?
Thanks for the suggestion. I took out ASTContext from the signature.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:209-214
+ if (nofTaintedArgs == 0)
+ Out << "Taint propagated to argument "
+ << TaintedArgs.at(Sym.index()) + 1;
+ else
+ Out << ", " << TaintedArgs.at(Sym.index()) + 1;
+ nofTaintedArgs++;
----------------
steakhal wrote:
> I'd recommend using `llvm::interleaveComma()` in such cases.
> You can probably get rid of `nofTaintedArgs` as well - by using this function.
I chose another solution. I hope that is ok too.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:213
+ else
+ Out << ", " << TaintedArgs.at(Sym.index()) + 1;
+ nofTaintedArgs++;
----------------
steakhal wrote:
> I believe this branch is uncovered by tests.
Now it is covered. See multipleTaintedSArgs(..) test in taint-diagnostic-visitor.c
================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:221
+ }
+ return std::string(Out.str());
+ });
----------------
steakhal wrote:
> I think since you explicitly specify the return type of the lambda, you could omit the spelling of `std::string` here.
not sure. Got a "cannot convert raw_svector_ostream::str() from llvm:StringRef" error.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:875-878
+ // FIXME: The call argument may be a complex expression
+ // referring to multiple tainted variables.
+ // Now we generate notes and track back only one of them.
+ SymbolRef TaintedSym = isTainted(State, *V);
----------------
steakhal wrote:
> You could iterate over the symbol dependencies of the SymExpr (of the `*V` SVal).
>
> ```lang=c++
> SymbolRef PointeeAsSym = V->getAsSymbol();
> // eee, can it be null? Sure it can. See isTainted(Region),... for those cases we would need to descend and check their symbol dependencies.
> for (SymbolRef SubSym : llvm::make_range(PointeeAsSym->symbol_begin(), PointeeAsSym->symbol_end())) {
> // TODO: check each if it's also tainted, and update the `TaintedSymbols` accordingly, IDK.
> }
> ```
> Something like this should work for most cases (except when `*V` refers to a tainted region instead of a symbol), I think.
I implememented a new function getTaintedSymbols(..) in Taint.cpp which returns all tainted symbols for a complex expr, SVal etc. With this addition, now we can track back multiple tainted symbols reaching a sink.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144269/new/
https://reviews.llvm.org/D144269
More information about the cfe-commits
mailing list