[PATCH] D144269: [Analyzer] Show "taint originated here" note of alpha.security.taint.TaintPropagation checker at the correct place

Wed Apr 5 06:30:25 PDT 2023

dkrupp added a comment.

All comments addressed. Thanks for the review @steakhal .

================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:162
+                                  const CallEvent& Call) {
+  const LocationContext* LC = Call.getCalleeStackFrame(0);
+
----------------
steakhal wrote:
> If the `Call` parameter is only used for acquiring the `LocationContext`, wouldn't it be more descriptive to directly pass the `LocationContext` to the function instead?
> I'm also puzzled that we use `getCalleeStackFrame` here. I rarely ever see this function, so I'm a bit worried if this pick was intentional. That we pass the `0` as the `BlockCount` argument only reinforces this instinct.
The call.getCalleeStackFrame(0) gets the location context of the actual call that we are analyzing (in the pre or postcall), and that's what we need to mark interesting. It is intentionally used like this. I changed the parameter to locationcontext as use suggested.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:169
+        // We give diagnostics only for taint related reports
+        if (!BR.isInteresting(LC) ||
+            BR.getBugType().getCategory() != categories::TaintedData) {
----------------
steakhal wrote:
> What does the first half of this condition guard against?
> Do you have a test for it to demonstrate?
To only follow taint propagation to function calls which actually result in tainted variables used in the report and not every function which returns a tainted variable. 

char* taintDiagnosticPropagation2() is such a test which is failing without this due to giving extra unrelated propagation notes.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:214
+                                      << TaintedArgs.at(i) << "\n");
+              Out << "Taint propagated to argument " << TaintedArgs.at(i);
+            } else {
----------------
steakhal wrote:
> So, if `TaintedSymbols.size() > 1`, then the note message will look weird.
> Could you please have a test for this?
Test added multipleTaintedArgs (). I could not provoke the multi-argument message as we only track-back one tainted symbol now.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:871
+      }else{
+         LLVM_DEBUG(llvm::dbgs() << "Strange. Return value not tainted.\n");
+         LLVM_DEBUG(Call.getReturnValue().dumpToStream(llvm::dbgs()));
----------------
steakhal wrote:
> I cannot see a test against the "Strange" string. Is this dead code?
> Same for the other block.
It was a debugging code, which I removed. I noticed that in some cases (e.g. if the argument pointer is pointing to an unknown area) we don't get back a tainted symbol even though we call the addtaint on the arg/return value.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:919-922
+    //Add taintedness to stdin parameters
+    if (isStdin(C.getSVal(E),C.getASTContext())){
+      State = addTaint(State, C.getSVal(E));
+    }
----------------
steakhal wrote:
> I just want to get it confirmed that this hunk is unrelated to your change per-say. Is it?
> BTW I don't mind this change being part of this patch, rather the opposite. Finally, we will have it.
It is related in the sense that in isTainted() function call did not return a valid SymbolRef for stdin if we did not make the stdin tainted when we first see it. Caused  testcase to fail as it was before. Now it is handled similarly to other tainted symbols.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/Taint.cpp:228-230
+      if (SymbolRef TSR =
+              isTainted(State, SRV->getRegion(), Kind))
+        return TSR;
----------------
steakhal wrote:
> What does `TSR` abbreviate? I would find `TaintedSym` more descriptive.
TSR = Tainted Symbol Ref

but I changed it as you suggested.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/VLASizeChecker.cpp:220-228
+    if (Kind == VLA_Tainted)
+      BT.reset(new BugType(this,
+                           "Dangerous variable-length array (VLA) declaration",
+                           categories::TaintedData));
+    else
+      BT.reset(new BugType(this,
+                           "Dangerous variable-length array (VLA) declaration",
----------------
steakhal wrote:
> Why don't we use a distinct BugType for this?
You mean a new bug type instances? Would there be an advantage for that? Seemed to be simpler this way. To distinguish identify the tainted reports with the bug category.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144269/new/

https://reviews.llvm.org/D144269