[PATCH] D144269: [Analyzer] Show "taint originated here" note of alpha.security.taint.TaintPropagation checker at the correct place
Balázs Benics via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Apr 21 06:59:20 PDT 2023
steakhal requested changes to this revision.
steakhal added a comment.
This revision now requires changes to proceed.
I didn't go through the whole revision this time, but I think the next steps are already clear for the next round.
My impression was that I might not expressed my intent and expectations about the directions of the next step.
I hope I managed this time. Let me know if you have questions.
================
Comment at: clang/include/clang/StaticAnalyzer/Checkers/Taint.h:82-103
+/// Returns the tainted Symbols for a given Statement and state.
+std::vector<SymbolRef> getTaintedSymbols(ProgramStateRef State, const Stmt *S,
+ const LocationContext *LCtx,
+ TaintTagType Kind = TaintTagGeneric,
+ bool returnFirstOnly = false);
+
+/// Returns the tainted Symbols for a given SVal and state.
----------------
The overloads having the extra `ReturnFirstOnly` parameter shouldn't be visible here in the header.
That is an implementation detail that no users should know about.
Note that having a single default argument overload potentially doubles the variations the user might need to keep in mind when choosing the right one. So there is value in simplicity.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:248-254
+ std::vector<SymbolRef> TaintedSyms =
+ getTaintedSymbols(errorState, TaintedSVal);
+ // Mark all tainted symbols interesting
+ // to track back the propagation of taintedness.
+ for (auto Sym : TaintedSyms) {
+ BR->markInteresting(Sym);
+ }
----------------
================
Comment at: clang/lib/StaticAnalyzer/Checkers/DivZeroChecker.cpp:62
+ CheckerContext &C,
+ std::vector<SymbolRef> TaintedSyms) const {
+ if (ExplodedNode *N = C.generateErrorNode(StateZero)) {
----------------
================
Comment at: clang/lib/StaticAnalyzer/Checkers/DivZeroChecker.cpp:108-109
+ if ((stateNotZero && stateZero)) {
+ std::vector<SymbolRef> taintedSyms = getTaintedSymbols(C.getState(), *DV);
+ if (!taintedSyms.empty()) {
+ reportTaintBug("Division by a tainted value, possibly zero", stateZero, C,
----------------
================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:863-868
+ std::vector<SymbolRef> TaintedSyms =
+ getTaintedSymbols(State, Call.getReturnValue());
+ if (!TaintedSyms.empty()) {
+ TaintedSymbols.push_back(TaintedSyms[0]);
+ TaintedIndexes.push_back(ArgNum);
+ }
----------------
================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:875-879
+ std::vector<SymbolRef> TaintedSyms = getTaintedSymbols(State, *V);
+ if (!TaintedSyms.empty()) {
+ TaintedSymbols.push_back(TaintedSyms[0]);
+ TaintedIndexes.push_back(ArgNum);
+ }
----------------
In these cases, the code would acquire all the tainted subsymbols, which then we throw away and keep only the first one.
This is why I suggested the approach I did I'm my last review.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:936-940
+ IsMatching =
+ IsMatching || (PropSrcArgs.contains(I) &&
+ isTaintedOrPointsToTainted(State, C.getSVal(E)));
+ std::optional<SVal> TaintedSVal =
+ getTaintedPointeeOrPointer(State, C.getSVal(E));
----------------
Here `getTaintedPointeeOrPointer` would be called two times, unnecessarily.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:947-948
+ if (!TaintedArgSyms.empty()) {
+ TaintedSymbols.insert(TaintedSymbols.begin(), TaintedArgSyms.begin(),
+ TaintedArgSyms.end());
+ TaintedIndexes.push_back(I);
----------------
================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:1009-1010
assert(E);
- std::optional<SVal> TaintedSVal{getTaintedPointeeOrPointer(C, C.getSVal(E))};
+ std::optional<SVal> TaintedSVal{
+ getTaintedPointeeOrPointer(C.getState(), C.getSVal(E))};
----------------
================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:1019-1023
+ std::vector<SymbolRef> TaintedSyms =
+ getTaintedSymbols(C.getState(), *TaintedSVal);
+ for (auto TaintedSym : TaintedSyms) {
+ report->markInteresting(TaintedSym);
+ }
----------------
================
Comment at: clang/lib/StaticAnalyzer/Checkers/Taint.cpp:149
const LocationContext *LCtx, TaintTagType Kind) {
- SVal val = State->getSVal(S, LCtx);
- return isTainted(State, val, Kind);
+ return !getTaintedSymbols(State, S, LCtx, Kind, true).empty();
}
----------------
We usually pass booleans by "name".
================
Comment at: clang/lib/StaticAnalyzer/Checkers/Taint.cpp:294
+ std::vector<SymbolRef> TaintedCasts =
+ getTaintedSymbols(State, SC->getOperand(), Kind);
+ TaintedSymbols.insert(TaintedSymbols.begin(), TaintedCasts.begin(),
----------------
If `returnFirstOnly` is `true`, this `getTaintedSymbols()` call would still eagerly (needlessly) collect all of the symbols.
I'd recommend propagating the `returnFirstOnly` parameter to the recursive calls to avoid this problem.
I also encourage you to make use of the `llvm::append_range()` whenever makes sense.
================
Comment at: clang/lib/StaticAnalyzer/Checkers/VLASizeChecker.cpp:230-232
+ std::vector<SymbolRef> TaintedSyms = getTaintedSymbols(State, TaintedSVal);
+ for (auto Sym : TaintedSyms)
+ report->markInteresting(Sym);
----------------
Ah, how awesome it would be to have a `markInteresting(llvm::ArrayRef<SymbolRef>)` overload.
================
Comment at: clang/test/Analysis/taint-diagnostic-visitor.c:53-67
+//Tests if the originated note is correctly placed even if the path is
+//propagating through variables and expressions
+char* taintDiagnosticPropagation(){
+ char *pathbuf;
+ char *pathlist=getenv("PATH"); // expected-note {{Taint originated here}}
+ // expected-note at -1 {{Taint propagated to the return value}}
+ if (pathlist){ // expected-note {{Assuming 'pathlist' is non-null}}
----------------
steakhal wrote:
> I know this is subjective, but I'd suggest to reformat the tests to match LLVM style guidelines, unless the formatting is important for the test.
> Consistency helps the reader and reviewer, as code and tests are read many more times than written.
>
> This applies to the rest of the touched tests.
Originally I meant this to the rest of the test cases you change or add part of this patch. I hope it clarifies.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144269/new/
https://reviews.llvm.org/D144269
More information about the cfe-commits
mailing list