[PATCH] D145229: [analyzer] Improve the documentation of the alpha.security.taint.TaintPropagation checker

DonĂ¡t Nagy via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Mar 3 08:06:17 PST 2023


donat.nagy added inline comments.


================
Comment at: clang/docs/analyzer/checkers.rst:78-80
+The ``SuppressAddressSpaces`` option suppresses
 warnings for null dereferences of all pointers with address spaces. You can
 disable this behavior with the option
----------------
Why is this paragraph (and the one above it) wrapped inconsistently? If we are touching these docs, perhaps we could re-wrap them to e.g 80 characters / line.


================
Comment at: clang/docs/analyzer/checkers.rst:2385
 
   - network originating data
+  - files or standard input
----------------
This old word choice is awkward, consider replacing it with e.g. "data from the network".


================
Comment at: clang/docs/analyzer/checkers.rst:2388
   - environment variables
   - database originating data
 
----------------
Why not just "databases"?


================
Comment at: clang/docs/analyzer/checkers.rst:2404
+  }
+  strcat(cmd, filename);
+  system(cmd); // Warning: Untrusted data is passed to a system call
----------------
If the filename is too long (more than 1014 characters), this is a buffer overflow. I admit that having a secondary unrelated vulnerability makes the example more realistic :), but I think we should still avoid it. (This also appears in other variants of the example code, including the "No vulnerability anymore" one.)


================
Comment at: clang/docs/analyzer/checkers.rst:2426
+
+  if (access(filename,F_OK)){//sanitizing user input
+    printf("File does not exist\n");
----------------
Nitpick: the comment formatting is inconsistent: for example, this is lowercase while most others start with an uppercase letter, or half of the comments have a space after the `//` while the others don't.


================
Comment at: clang/docs/analyzer/checkers.rst:2457-2461
+  if (access(filename,F_OK)){//sanitizing user input
+    printf("File does not exist\n");
+    return -1;
+  }
+  csa_sanitize(filename); // Indicating to CSA that filename variable is safe to be used after this point
----------------
Separating the actual sanitization and the function that's magically recognized by the taint checker doesn't seem to be a good design pattern. Here `csa_sanitize()` is just a synonym for the "silence this checker here" marker, which is //very// confusing, because if someone is not familiar with this locally introduced no-op function, they will think that it's performing actual sanitization! At the very least we should rename this magical no-op to `csa_mark_sanitized()` or something similar.

The root issue is that in this example we would like to use a verifier function (that determines whether the tainted data is safe) instead of a sanitizer function (that can convert //any// tainted data into safe data) and our taint handling engine is not prepared to handle conditional Filter effects like "this function removes taint from its first argument //if its return value is true//".

I think it would be much better to switch to a different example where the "natural" solution is more aligned with the limited toolbox provided by our taint framework (i.e. it's possible define a filter function that actually removes problematic parts of the untrusted input).


================
Comment at: clang/docs/analyzer/checkers.rst:2505
 
-There are built-in sources, propagations and sinks defined in code inside ``GenericTaintChecker``.
-These operations are handled even if no external taint configuration is provided.
+There are built-in sources, propagations and sinks even if no external taint configuration is provided.
 
----------------
Perhaps explicitly mention that there are no built-in filters.


================
Comment at: clang/docs/analyzer/checkers.rst:2535
+
+* The taintedness property is not propagated through function calls which are unkown (or too complex) to the analyzer, unless there is a specific
+propagation rule built-in to the checker or given in the YAML configuration file. This causes potential true positive findings to be lost.
----------------
Spellcheck: "unknown"


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145229/new/

https://reviews.llvm.org/D145229



More information about the cfe-commits mailing list