[clang] [analyzer] Moving TaintPropagation checker out of alpha (PR #67352)

Daniel Krupp via cfe-commits cfe-commits at lists.llvm.org
Mon Oct 9 10:29:07 PDT 2023


dkrupp wrote:

@haoNoQ  thanks for pointing out #61826 umbrella issue, I somehow missed that.

I see this TaintPropagation checker as a generic flexible tool to find potential vulnerable data flows between any taint source and taint sink. The user should be configure sources and sinks in the yaml config file. The default settings will not always be correct. (Whether a file is a tainted source or not, depends a lot on the actual application domain and maybe the deployment.)
This checker warns for every sink unconditionally.

The user can explicitly mark a value as sanitized in these cases:
```
  scanf (" %1023[^\n]", filename);
  if (access(filename,F_OK)){// Verifying user input
    printf("File does not exist\n");
    return -1;
  }
  #ifdef __clang_analyzer__
    csa_mark_sanitized(filename); // Indicating to CSA that filename variable is safe to be used after this point
  #endif
  strcat(cmd, filename);
  system(cmd); // No warning
```
See the example for details in the doc:
https://clang.llvm.org/docs/analyzer/checkers.html#alpha-security-taint-taintpropagation-c-c

Of course it is annoying to add such instructions for tainted integer values, for which the analyzer can perform bounds checking automatically and refute some trivial false positives cases which you also pointed out.

I saw many false positives for malloc(..) calls when tainted buffer was attempted to be copied to another buffer and fixed that in this PR (by removing taint propagation for strlen):
* https://github.com/llvm/llvm-project/pull/66086
For me it refuted most of the FPs on open source projects.

At the end, I agree with you that it is better to remove all the unconditional integer taint sinks from the TaintPropagation checker and handle it in the MallocChecker, CStringChecker to take integer constraints into consideration.
So this PR does that:
* https://github.com/llvm/llvm-project/pull/68607
I will make follow-up patches to re-add these warnings for `malloc()` and `memcpy()` etc. on the MallocChecker, CStringChecker with a more sophisticated handling.

The VLA case you pointed out I fixed in:
* https://github.com/llvm/llvm-project/pull/68140

In case of strings , it is very difficult (in general impossible?) to automatically recognize the programmer's attempt to santize the data.
So I think  `csa_mark_sanitized(string);` is an acceptable solution to get rid of unwanted warnings (or using a triaging tool).

I would add the `TaintPropagation` checker to the `optin` package as an optional security analysis checker which is not enabled by default, because its usage is very application area dependent. Some of the security concious applications would want to create the config file for their sources and sinks and even mark a few FPs, if they can catch a few vulnerabilities.

Does this approach sound acceptable to you?
>From which project did you see the ~300 false reports? 

https://github.com/llvm/llvm-project/pull/67352


More information about the cfe-commits mailing list