[PATCH] D141389: [DFSAN] Add support for strnlen, strncat, strsep, sscanf and _tolower
Andrew via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Feb 7 08:02:45 PST 2023
browneee added inline comments.
================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:221
+ if (flags().strict_data_dependencies) {
+ *ret_label = res ? s_label : 0;
+ } else {
----------------
When `res != NULL`, then `res` is derived from `*s`, not from `s`.
e.g.
```
*ret_label = res ? dfsan_get_label(base) : 0;
```
================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:224
+ size_t s_bytes_read = (res ? strlen(res) : strlen(base)) + 1;
+ *ret_label =
+ dfsan_union(dfsan_read_label(base, s_bytes_read),
----------------
I think `dfsan_get_label(base)` should also be a part of this output.
================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:241
+ if (res)
+ *ret_origin = s_origin;
+ } else {
----------------
As above, when res != NULL, then res is derived from *s, not from s.
```
if (res)
*ret_origin = dfsan_get_origin(base);
```
================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:243
+ } else {
+ if (*ret_label) {
+ size_t s_bytes_read = (res ? strlen(res) : strlen(base)) + 1;
----------------
As above, I think `dfsan_get_origin(base)` should be the first priority, as it is the origin with the most direct value flow.
Note that `dfsan_get_origin(base)` gives the origin id for `base` whereas `dfsan_read_origin_of_first_taint(base, ...)` gives origin id for `*base`.
================
Comment at: compiler-rt/test/dfsan/custom.cpp:1659
+
+ rv = strsep(pp_s, p_delim);
+
----------------
I think this test can call strsep fewer times. It just needs to set all the different labels first.
Different inputs to taint (each one should be a distinct taint label):
* `*p_delim` (with `dfsan_set_label(..., p_delim, strlen(p_delim));`)
* `p_delim` (with `dfsan_set_label(..., &p_delim, sizeof(&p_delim));`)
* `*s` (with `dfsan_set_label(..., s, strlen(s));`)
* `s` aka `p_s` (with `dfsan_set_label(..., &s, sizeof(&s));` or `dfsan_set_label(..., &p_s, sizeof(&p_s));`)
* `pp_s` (with `dfsan_set_label(..., &pp_s, sizeof(&pp_s));`)
This is less than 8, so we can do it all at once (unless you want to test different taint labels on different bytes of the input strings).
Repository:
rCRT Compiler Runtime
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D141389/new/
https://reviews.llvm.org/D141389
More information about the cfe-commits
mailing list