[PATCH] D141389: [DFSAN] Add support for strnlen, strncat, strsep, sscanf and _tolower

Andrew via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 10 18:10:37 PST 2023


browneee requested changes to this revision.
browneee added a comment.
This revision now requires changes to proceed.

Hello! Thank you for the patch.

I think your changes for `__dfsw_strsep` are conflating the taint value of a pointer with the taint value of the value(s) it points to.
This is subtle - I believe I made the same mistake when I first worked with the DFSan code :)

With this in mind, please check the other functions.

Since this is a large patch and string manipulations need careful review, please send separate changes for each group of functions (e.g. first patch is `__dfsw_strsep` + `__dfso_strsep` + tests for this code).



================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:213
+  char *res = strsep(s, delim);
+  s_label = dfsan_read_label(base, strlen(base));
+  if (res && (res != base)) {
----------------
The `s_label` represents the taint label for `s` (the pointer).

This line would clobber the taint label of the pointer (`s`) with a taint label from `s[0][0..n]`.

I think this line should be deleted.


================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:214
+  s_label = dfsan_read_label(base, strlen(base));
+  if (res && (res != base)) {
+    char *token_start = res;
----------------
I think `res && (res != base)` is never true.

Checking an implementation of strsep (http://git.musl-libc.org/cgit/musl/tree/src/string/strsep.c)
`res` can either be `NULL` or the same value as `base`.

I think this line should be `(res != *s)`. This is different because `*s` may be changed by the call to `strsep` just above.


================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:239
+  char *base = *s;
+  s_origin = dfsan_read_origin_of_first_taint(base, strlen(base) + 1);
+  char *res = __dfsw_strsep(s, delim, s_label, delim_label, ret_label);
----------------
Delete this line.

Like above, `s_origin` represents the origin value for the pointer `s`, not the origin value for for data in `base[0..n]`.


================
Comment at: compiler-rt/test/dfsan/custom.cpp:1693
+
+  dfsan_set_label(n_label, p_delim, sizeof(p_delim));
+
----------------
Also
dfsan_set_label(<some_different_label_x>, &p_delim, sizeof(&p_delim));


================
Comment at: compiler-rt/test/dfsan/custom.cpp:1698
+#ifdef STRICT_DATA_DEPENDENCIES
+  ASSERT_ZERO_LABEL(rv);
+#else
----------------
Also
ASSERT_READ_ZERO_LABEL(rv, strlen(rv))


================
Comment at: compiler-rt/test/dfsan/custom.cpp:1700
+#else
+  ASSERT_LABEL(rv, n_label);
+  ASSERT_INIT_ORIGIN_EQ_ORIGIN(&rv, *p_delim);
----------------
ASSERT_LABEL(rv, dfsan_union(some_different_label_x, n_label));


================
Comment at: compiler-rt/test/dfsan/custom.cpp:1704
+
+  dfsan_set_label(m_label, p_s, sizeof(p_s));
+  rv = strsep(&p_s, p_delim);
----------------
Also
dfsan_set_label(<some_different_label_y>, &p_s, sizeof(&p_s));


================
Comment at: compiler-rt/test/dfsan/custom.cpp:1709
+#ifdef STRICT_DATA_DEPENDENCIES
+  ASSERT_LABEL(rv, m_label);
+  ASSERT_INIT_ORIGIN_EQ_ORIGIN(&rv, base[6]);
----------------
Change this to

ASSERT_READ_LABEL(rv, strlen(rv), m_label);
ASSERT_LABEL(rv, some_different_label_y);


================
Comment at: compiler-rt/test/dfsan/custom.cpp:1712
+#else
+  ASSERT_LABEL(rv, dfsan_union(m_label, n_label));
+  ASSERT_INIT_ORIGIN_EQ_ORIGIN(&rv, base[6]);
----------------
some_different_label_y would also be set on rv.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141389/new/

https://reviews.llvm.org/D141389



More information about the cfe-commits mailing list