[PATCH] D141389: [DFSAN] Add support for strnlen, strncat, strsep, sscanf and _tolower
Andrew via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Mar 31 16:35:43 PDT 2023
browneee added inline comments.
================
Comment at: compiler-rt/lib/dfsan/dfsan_custom.cpp:221
+ if (flags().strict_data_dependencies) {
+ *ret_label = res ? dfsan_read_label(base, sizeof(base)) : 0;
+ } else {
----------------
tkuchta wrote:
> browneee wrote:
> > `base, sizeof(base)` does not make sense - the size does not correspond to the pointer used.
> >
> > Either
> > ```
> > dfsan_read_label(base, sizeof(*base)) // first byte pointed to by base
> > dfsan_read_label(base, strlen(base)) // whole string pointed to by base
> > dfsan_read_label(&base, sizeof(base)) // the base pointer
> > ```
> >
> > In this case I think we want the base pointer.
> >
> > `dfsan_read_label(&base, sizeof(base))` // the base pointer
> > should be equivalent to doing
> > `dfsan_label base_label = dfsan_read_label(s, sizeof(*s))` up at the start of the function, just after we declare base then use `base_label` here.
> >
> > Lets go with the second option to avoid taking the address of a variable.
> >
> > This is semantically equivalent to my first suggestion: `dfsan_get_label(base) == dfsan_read_label(&base, sizeof(base)) == base_label`.
> > Sorry I didn't consider the other constraints (no dfsan_get_label in this file because the pass is not run on this code; avoid taking address of variable) and suggest this in the first place.
> Thank you for your comments! Just to clarify:
> If we have strict data dependencies we would like *ret_label to be related to the taints of the contents of the string, is that correct? Should we use the base_label there too? My understanding is that base_label represents a taint applied to the pointer, not to the data?
>
> In other words, would that be correct to:
> 1) use taints of the string data in the first case (strict dependencies) -> therefore no base_label there as it represents the taint of the pointer
> 2) use the taints of the string data + the taints of the pointer in the second case -> therefore using base_label there
> If we have strict data dependencies we would like *ret_label to be related to the taints of the contents of the string, is that correct?
No, `*ret_label` holds the taint label for the return value, which is a pointer. The return value pointer is derived from `*s`, aka `base`. This is the pointer to the string contents, not the string contents themselves.
> Should we use the base_label there too?
I think we should only use `base_label` here.
> My understanding is that base_label represents a taint applied to the pointer, not to the data?
Correct.
> In other words, would that be correct to:
> 1. use taints of the string data in the first case (strict dependencies) -> therefore no base_label there as it represents the taint of the pointer
No, other way around. We want the base_label, but not the string data.
> 2. use the taints of the string data + the taints of the pointer in the second case -> therefore using base_label there
Yes, we could have both base_label and the taints of the string data in the else case.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D141389/new/
https://reviews.llvm.org/D141389
More information about the cfe-commits
mailing list