[PATCH] D145229: [analyzer] Improve the documentation of the alpha.security.taint.TaintPropagation checker
DonĂ¡t Nagy via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Mar 3 08:06:17 PST 2023
donat.nagy added inline comments.
================
Comment at: clang/docs/analyzer/checkers.rst:78-80
+The ``SuppressAddressSpaces`` option suppresses
warnings for null dereferences of all pointers with address spaces. You can
disable this behavior with the option
----------------
Why is this paragraph (and the one above it) wrapped inconsistently? If we are touching these docs, perhaps we could re-wrap them to e.g 80 characters / line.
================
Comment at: clang/docs/analyzer/checkers.rst:2385
- network originating data
+ - files or standard input
----------------
This old word choice is awkward, consider replacing it with e.g. "data from the network".
================
Comment at: clang/docs/analyzer/checkers.rst:2388
- environment variables
- database originating data
----------------
Why not just "databases"?
================
Comment at: clang/docs/analyzer/checkers.rst:2404
+ }
+ strcat(cmd, filename);
+ system(cmd); // Warning: Untrusted data is passed to a system call
----------------
If the filename is too long (more than 1014 characters), this is a buffer overflow. I admit that having a secondary unrelated vulnerability makes the example more realistic :), but I think we should still avoid it. (This also appears in other variants of the example code, including the "No vulnerability anymore" one.)
================
Comment at: clang/docs/analyzer/checkers.rst:2426
+
+ if (access(filename,F_OK)){//sanitizing user input
+ printf("File does not exist\n");
----------------
Nitpick: the comment formatting is inconsistent: for example, this is lowercase while most others start with an uppercase letter, or half of the comments have a space after the `//` while the others don't.
================
Comment at: clang/docs/analyzer/checkers.rst:2457-2461
+ if (access(filename,F_OK)){//sanitizing user input
+ printf("File does not exist\n");
+ return -1;
+ }
+ csa_sanitize(filename); // Indicating to CSA that filename variable is safe to be used after this point
----------------
Separating the actual sanitization and the function that's magically recognized by the taint checker doesn't seem to be a good design pattern. Here `csa_sanitize()` is just a synonym for the "silence this checker here" marker, which is //very// confusing, because if someone is not familiar with this locally introduced no-op function, they will think that it's performing actual sanitization! At the very least we should rename this magical no-op to `csa_mark_sanitized()` or something similar.
The root issue is that in this example we would like to use a verifier function (that determines whether the tainted data is safe) instead of a sanitizer function (that can convert //any// tainted data into safe data) and our taint handling engine is not prepared to handle conditional Filter effects like "this function removes taint from its first argument //if its return value is true//".
I think it would be much better to switch to a different example where the "natural" solution is more aligned with the limited toolbox provided by our taint framework (i.e. it's possible define a filter function that actually removes problematic parts of the untrusted input).
================
Comment at: clang/docs/analyzer/checkers.rst:2505
-There are built-in sources, propagations and sinks defined in code inside ``GenericTaintChecker``.
-These operations are handled even if no external taint configuration is provided.
+There are built-in sources, propagations and sinks even if no external taint configuration is provided.
----------------
Perhaps explicitly mention that there are no built-in filters.
================
Comment at: clang/docs/analyzer/checkers.rst:2535
+
+* The taintedness property is not propagated through function calls which are unkown (or too complex) to the analyzer, unless there is a specific
+propagation rule built-in to the checker or given in the YAML configuration file. This causes potential true positive findings to be lost.
----------------
Spellcheck: "unknown"
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D145229/new/
https://reviews.llvm.org/D145229
More information about the cfe-commits
mailing list