[PATCH] D141389: [DFSAN] Add support for strnlen, strncat, strsep, sscanf and _tolower

Tomasz Kuchta via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Mar 31 14:33:08 PDT 2023


tkuchta added inline comments.


================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:221
+  if (flags().strict_data_dependencies) {
+    *ret_label = res ? dfsan_read_label(base, sizeof(base)) : 0;
+  } else {
----------------
browneee wrote:
> `base, sizeof(base)` does not make sense - the size does not correspond to the pointer used.
> 
> Either
> ```
> dfsan_read_label(base, sizeof(*base))   // first byte pointed to by base
> dfsan_read_label(base, strlen(base))     // whole string pointed to by base
> dfsan_read_label(&base, sizeof(base))  // the base pointer
> ```
> 
> In this case I think we want the base pointer.
> 
> `dfsan_read_label(&base, sizeof(base))`  // the base pointer
> should be equivalent to doing
> `dfsan_label base_label = dfsan_read_label(s, sizeof(*s))`  up at the start of the function, just after we declare base then use `base_label` here.
> 
> Lets go with the second option to avoid taking the address of a variable.
> 
> This is semantically equivalent to my first suggestion: `dfsan_get_label(base) == dfsan_read_label(&base, sizeof(base)) == base_label`.
> Sorry I didn't consider the other constraints (no dfsan_get_label in this file because the pass is not run on this code; avoid taking address of variable) and suggest this in the first place.
Thank you for your comments! Just to clarify:
If we have strict data dependencies we would like *ret_label to be related to the taints of the contents of the string, is that correct? Should we use the base_label there too? My understanding is that base_label represents a taint applied to the pointer, not to the data? 

In other words, would that be correct to:
1) use taints of the string data in the first case (strict dependencies) -> therefore no base_label there as it represents the taint of the pointer
2) use the taints of the string data + the taints of the pointer in the second case -> therefore using base_label there


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141389/new/

https://reviews.llvm.org/D141389



More information about the cfe-commits mailing list