[PATCH] D141389: [DFSAN] Add support for strnlen, strncat, strsep, sscanf and _tolower
Andrew via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Mar 31 10:43:47 PDT 2023
browneee added a comment.
We're getting really close!
Yes, that build error looks unrelated. Someone should fix it soon.
================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:221
+ if (flags().strict_data_dependencies) {
+ *ret_label = res ? dfsan_read_label(base, sizeof(base)) : 0;
+ } else {
----------------
`base, sizeof(base)` does not make sense - the size does not correspond to the pointer used.
Either
```
dfsan_read_label(base, sizeof(*base)) // first byte pointed to by base
dfsan_read_label(base, strlen(base)) // whole string pointed to by base
dfsan_read_label(&base, sizeof(base)) // the base pointer
```
In this case I think we want the base pointer.
`dfsan_read_label(&base, sizeof(base))` // the base pointer
should be equivalent to doing
`dfsan_label base_label = dfsan_read_label(s, sizeof(*s))` up at the start of the function, just after we declare base then use `base_label` here.
Lets go with the second option to avoid taking the address of a variable.
This is semantically equivalent to my first suggestion: `dfsan_get_label(base) == dfsan_read_label(&base, sizeof(base)) == base_label`.
Sorry I didn't consider the other constraints (no dfsan_get_label in this file because the pass is not run on this code; avoid taking address of variable) and suggest this in the first place.
================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:224
+ *ret_label =
+ dfsan_union(dfsan_read_label(base, sizeof(base)),
+ dfsan_union(dfsan_read_label(delim, strlen(delim) + 1),
----------------
Also use `base_label` here.
================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:240
+ if (res)
+ *ret_origin = dfsan_read_origin_of_first_taint(base, strlen(base));
+ } else {
----------------
`dfsan_get_origin(base) == dfsan_read_origin_of_first_taint(&base, sizeof(base))`
As noted above `base, strlen(base)` is a meaningfully valid pointer, length - but it is not the level of indirection we want here.
I think we want the same solution as above.
`dfsan_origin base_origin = dfsan_read_origin_of_first_taint(s, sizeof(*s))` at the start of the function, just after declaring and assigning base.
================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:243
+ if (*ret_label) {
+ dfsan_origin o = dfsan_read_origin_of_first_taint(base, strlen(base));
+ if (o) {
----------------
Also use `base_origin` here.
================
Comment at: compiler-rt/test/dfsan/custom.cpp:1641
+ // taint delim pointer
+ dfsan_set_label(j_label, &p_delim, sizeof(&p_delim));
+ // taint the string data bytes
----------------
remove the &
It still works because `sizeof(pointer_var) == sizeof(&pointer_var)`, but it doesn't logically match up like it should.
(Sorry, this one is my fault - I made this mistake giving an example in an earlier comment.)
================
Comment at: compiler-rt/test/dfsan/custom.cpp:1645
+ // taint the string pointer
+ dfsan_set_label(m_label, &p_s, sizeof(&p_s));
+
----------------
remove & in sizeof
================
Comment at: compiler-rt/test/dfsan/custom.cpp:1650
+#ifdef STRICT_DATA_DEPENDENCIES
+ ASSERT_LABEL(rv, k_label);
+ ASSERT_READ_LABEL(rv, strlen(rv), k_label);
----------------
The value `rv` has a data flow from the the string pointer, not the string bytes.
It should have `m_label` not `k_label`.
This is related to line 221 (using `base` instead of `&base`) in the other file.
================
Comment at: compiler-rt/test/dfsan/custom.cpp:1654
+ ASSERT_LABEL(rv, dfsan_union(dfsan_union(i_label, j_label), k_label));
+ ASSERT_INIT_ORIGIN_EQ_ORIGIN(&rv, *s);
+#endif
----------------
rv is a pointer, and I think it's origin should match the pointer s, not the bytes in the string.
I think this should be:
`ASSERT_INIT_ORIGIN_EQ_ORIGIN(&rv, s);`
This is related to line 240 (`*ret_origin = dfsan_read_origin_of_first_taint(base, strlen(base));`) in the other file being the wrong level of indirection.
(Side note: the existing ASSERT_INIT_ORIGIN_EQ_ORIGIN macro feels a bit odd in that the arguments are at different levels of indirection - but not something to fix as part of this change)
================
Comment at: compiler-rt/test/dfsan/custom.cpp:1660
+ char **pp_s_base = pp_s;
+ dfsan_set_label(n_label, &pp_s, sizeof(&pp_s));
+
----------------
remove & in sizeof
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D141389/new/
https://reviews.llvm.org/D141389
More information about the cfe-commits
mailing list