[PATCH] D141389: [DFSAN] Add support for strnlen, strncat, strsep, sscanf and _tolower

Andrew via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 17 00:35:10 PST 2023


browneee added inline comments.


================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:213
+  char *res = strsep(s, delim);
+  s_label = dfsan_read_label(base, strlen(base));
+  if (res && (res != base)) {
----------------
tkuchta wrote:
> browneee wrote:
> > The `s_label` represents the taint label for `s` (the pointer).
> > 
> > This line would clobber the taint label of the pointer (`s`) with a taint label from `s[0][0..n]`.
> > 
> > I think this line should be deleted.
> Agree, s_label represents the taint associated with the **s pointer. However, I am now wondering if that is the taint which we would like to return.
> For example, if we have
> if (flags().strict_data_dependencies) {
>     *ret_label = res ? s_label : 0;
> 
> We would taint the return value with the value of the pointer, not the data. It means that if we operate on a string for which the characters are tainted, but the pointer itself isn't, we are likely going to return label 0. My understanding was that we are more concerned with the taint of the data, not the pointer, am I missing something?
> 
Yes, we are usually more concerned with the taint of the data, not the pointer.

With strict dependencies:
// If the input pointer is tainted, the output pointer would be tainted (because it is derived from the input pointer - maybe the same value).
taint(s[0]) == dfsan_read_label(s, sizeof(s)) ====> taint(ret) == ret_label[0]

// If the input data is tainted, the output data would be tainted (because it is derived from the input data).
taint(s[0][0]) == MEM_TO_SHADOW(s[0])[0] ====> taint(ret[0]) == MEM_TO_SHADOW(ret)[0]

Because the original value of s[0] equals ret in the non-null case (otherwise ret is null), the output shadow bytes are the same bytes as the input shadow bytes, so the taint labels for the string data in shadow memory do not need to be explicitly propagated in this function.

I think the only case that actually changes or copies string data is overwriting a delimiter byte with NUL ('\0'), which you handled.
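
To make the aliasing argument concrete, here is a rough sketch using a toy shadow array rather than the real DFSan runtime. The `ToyResult` struct, the `toy_strsep_demo` function, and the one-label-byte-per-data-byte `shadow` vector are all hypothetical stand-ins invented for illustration; only `strsep` and `strlen` are real calls. It shows that the returned pointer is the original `*s`, so the token's labels are already in place, and that the only byte needing an explicit label update is the overwritten delimiter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string.h>  // strsep is a glibc/BSD extension declared here
#include <vector>

// Hypothetical toy model, NOT the DFSan runtime: one label byte per data byte.
struct ToyResult {
  bool ret_is_original_base;  // strsep returned the original *s
  bool delimiter_zeroed;      // the delimiter byte was overwritten with '\0'
  std::uint8_t token_label;   // label of the first token byte after the call
  std::uint8_t nul_label;     // label of the overwritten delimiter byte
  std::size_t rest_offset;    // where *s points after the call
};

ToyResult toy_strsep_demo() {
  char buf[] = "key=value";
  // Pretend every byte of the string data carries label 1.
  std::vector<std::uint8_t> shadow(sizeof(buf), 1);

  char *s = buf;
  char *base = s;               // remember the input value of s[0]
  char *res = strsep(&s, "=");  // returns the old *s and advances s
  assert(res != nullptr && s != nullptr);  // this input contains a delimiter

  std::size_t start = static_cast<std::size_t>(res - buf);
  std::size_t nul_at = start + strlen(res);  // index of the former '='

  ToyResult r{};
  r.ret_is_original_base = (res == base);
  r.delimiter_zeroed = (buf[nul_at] == '\0');

  // The token bytes were not moved, so their labels need no copying.
  // The written '\0' is a constant, so its label is cleared explicitly.
  shadow[nul_at] = 0;

  r.token_label = shadow[start];
  r.nul_label = shadow[nul_at];
  r.rest_offset = static_cast<std::size_t>(s - buf);
  return r;
}
```

Calling `toy_strsep_demo()` leaves the token's label untouched and clears only the label at the delimiter position, which mirrors the reasoning above: the pointer label and the data labels are tracked separately, and only the pointer label has to be passed through `ret_label`.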


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141389/new/

https://reviews.llvm.org/D141389


