[PATCH] D144269: [Analyzer] Show "taint originated here" note of alpha.security.taint.TaintPropagation checker at the correct place

Daniel Krupp via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Feb 17 07:36:15 PST 2023


dkrupp created this revision.
dkrupp added a reviewer: Szelethus.
dkrupp added a project: clang.
Herald added subscribers: steakhal, manas, ASDenysPetrov, martong, gamesh411, donat.nagy, mikhail.ramalho, a.sidorin, JDevlieghere, rnkovacs, szepet, baloghadamsoftware, xazax.hun.
Herald added a reviewer: NoQ.
Herald added a project: All.
dkrupp requested review of this revision.
Herald added a subscriber: cfe-commits.

This patch improves the diagnostics of the alpha.security.taint.TaintPropagation and taint related checkers by showing the "Taint originated here" note at the correct place, where the attacker may inject it. This greatly improves the understandability of the taint reports.

Taint Analysis: The attacker injects the malicious data at the taint source (e.g. getenv() call) which is then propagated and used at taint sink (e.g. exec() call) causing a security vulnerability (e.g. shell injection vulnerability), without data sanitation.

The goal of the checker is to discover and show to the user these potential taint source, sink pairs and the propagation call chain.

In the baseline the taint source was pointing to an invalid location, typically somewhere between the real taint source and sink.

After the fix, the "Taint originated here" tag is correctly shown at the taint source. This is the function call where the attacker can inject a malicious data (e.g. reading from environment variable, reading from file, reading from standard input etc.).

Before the patch the clang static analyzer puts the taint origin note wrongly to the `strtol(..)` call.

  c
  int main(){
    char *pathbuf;
    char *user_data=getenv("USER_INPUT"); 
    char *end;  
    long size=strtol(user_data, &end, 10); // note: Taint originated here. 
    if (size > 0){
      pathbuf=(char*) malloc(size+1);//note: Untrusted data is used to specify the buffer size ...
      // ... 
      free(pathbuf);
    }
    return 0;
  }

After the fix, the taint origin point is correctly annotated at `getenv()` where the attacker really injects the value.

  c
  int main(){
    char *pathbuf;
    char *user_data=getenv("USER_INPUT");  // note: Taint originated here. 
    char *end;
    long size=strtol(user_data, &end, 10); 
    if (size > 0){
      pathbuf=(char*) malloc(size+1);//note: Untrusted data is used to specify the buffer size ...
      // ... 
      free(pathbuf);
    }
    return 0;
  }

The BugVisitor placing the note was wrongly going back only until introduction of the tainted SVal in the sink.

This patch creates a new uniquely identified taint flow for each taint source (e.g.getenv()) it traverses and places a NoteTag ("Taint originated here") with the new id. Then, when the bug report is generated, the taint flow id is propagated back (in the new TainBugReport) along the bug path and the correct "Taint originated here." annotation is generated (matching the flow id).

You can find the new improved reports

here <https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?review-status=Unreviewed&review-status=Confirmed%20bug&detection-status=New&detection-status=Reopened&detection-status=Unresolved&run=%2ataint_improvement_fixed&is-unique=off&diff-type=New&checker-msg=%2auntrusted%2a&checker-msg=%2ataint%2a>

And the old reports (look out for "Taint originated here" notes. They are at the wrong place, close to the end of the reports)

here <https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?review-status=Unreviewed&review-status=Confirmed%20bug&detection-status=New&detection-status=Reopened&detection-status=Unresolved&run=%2ataint_improvement_baseline&is-unique=off&diff-type=New&checker-msg=%2auntrusted%2a&checker-msg=%2ataint%2a>


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D144269

Files:
  clang/include/clang/StaticAnalyzer/Checkers/Taint.h
  clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h
  clang/include/clang/StaticAnalyzer/Core/BugReporter/CommonBugCategories.h
  clang/include/clang/StaticAnalyzer/Core/BugReporter/TaintBugType.h
  clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
  clang/lib/StaticAnalyzer/Checkers/DivZeroChecker.cpp
  clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
  clang/lib/StaticAnalyzer/Checkers/Taint.cpp
  clang/lib/StaticAnalyzer/Checkers/VLASizeChecker.cpp
  clang/lib/StaticAnalyzer/Core/BugReporter.cpp
  clang/lib/StaticAnalyzer/Core/CommonBugCategories.cpp
  clang/test/Analysis/diagnostics/Inputs/expected-sarif/sarif-diagnostics-taint-test.c.sarif
  clang/test/Analysis/diagnostics/Inputs/expected-sarif/sarif-multi-diagnostic-test.c.sarif
  clang/test/Analysis/diagnostics/sarif-diagnostics-taint-test.c
  clang/test/Analysis/taint-checker-callback-order-has-definition.c
  clang/test/Analysis/taint-checker-callback-order-without-definition.c
  clang/test/Analysis/taint-diagnostic-visitor.c
  clang/test/Analysis/taint-dumps.c

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D144269.498366.patch
Type: text/x-patch
Size: 54980 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20230217/3fc2d53d/attachment-0001.bin>


More information about the cfe-commits mailing list