[PATCH] More string interceptors: strstr, strcasestr, strcspn, strpbrk

Maria Guseva m.guseva at samsung.com
Tue Apr 21 08:29:08 PDT 2015


================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:279
@@ +278,3 @@
+    COMMON_INTERCEPTOR_READ_STRING_OF_LEN(ctx, s2, len2, r ? len2 + 1 :
+                                   (len2 <= len1 ? len2 : len1 + 1));
+}
----------------
dvyukov wrote:
> m.guseva wrote:
> > dvyukov wrote:
> > > I am not sure such complex computations are necessary. People should not rely on such things.
> > > If anything, I think, the fifth argument should be 1, because if I call strstr("aaaaaaaaaaaa", "bcdef") I can "expect" that the function will read only the first symbol "b" from the second string.
> > Do you mean the following: (ctx, s2, len2, r ? len2 + 1 : (len2 <= len1 ? 1 : len1 + 1)); ?
> > In that case we may not check real read of bigger length from s2 in case there is some not null intersection of s1 and s2.
> > The most accurate result would be if repeat explicitly strstr computations and find that real length. But I've just added non-strict check assuming the worst case.  
> > 
> The worst case is strlen(s2)+1. An implementation is allowed to read that much.
> The non-strict mode should consider the _best_ case if anything. But I would just delete this complexity and use strlen.
> 
>The non-strict mode should consider the _best_ case if anything.
I understand the non-strict mode as ability to check less then full string length when it's possible. It shouldn't rely on best case, it should check at least as much as needed for getting the result. I think it's better to check more bytes than to miss incorrect read. If proposed read check is inaccurate I agree it's better to replace with full length check.

================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:322
@@ +321,3 @@
+    COMMON_INTERCEPTOR_READ_STRING(ctx, s2,
+        s1 && *s1 ? internal_strchr(s2, s1[r]) - s2 + 1 : 0);
+    COMMON_INTERCEPTOR_READ_STRING(ctx, s1, r + 1);
----------------
dvyukov wrote:
> m.guseva wrote:
> > dvyukov wrote:
> > > I don't believe this is necessary. Make it just:
> > > READ_RANGE(s2, REAL(strlen)(s2) + 1);
> > Why not?
> > Indeed we don't need to read the full length of s2. User may expect it in non-strict case.
> Because these are very complex expressions with lots of corner cases and potential off-by-ones.
> User should not expect that the function won't read the whole string. There are not such guarantees in the documentation. Documentation refers to arguments as to strings. Strings are addressable zero-terminated regions of memory.
> 
>User should not expect that the function won't read the whole string. There are not such guarantees in the documentation. 
The "strict_string_checks" flag was introduced exactly cause user may expect such behaviour despite of documentation. Which are corner cases here? Can I just cover them?

http://reviews.llvm.org/D9017

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list