[PATCH] More string interceptors: strstr, strcasestr, strcspn, strpbrk

Dmitry Vyukov dvyukov at google.com
Mon Apr 20 07:02:36 PDT 2015

Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:279
@@ +278,3 @@
+    COMMON_INTERCEPTOR_READ_STRING_OF_LEN(ctx, s2, len2, r ? len2 + 1 :
+                                   (len2 <= len1 ? len2 : len1 + 1));
m.guseva wrote:
> dvyukov wrote:
> > I am not sure such complex computations are necessary. People should not rely on such things.
> > If anything, I think, the fifth argument should be 1, because if I call strstr("aaaaaaaaaaaa", "bcdef") I can "expect" that the function will read only the first symbol "b" from the second string.
> Do you mean the following: (ctx, s2, len2, r ? len2 + 1 : (len2 <= len1 ? 1 : len1 + 1)); ?
> In that case we may not check real read of bigger length from s2 in case there is some not null intersection of s1 and s2.
> The most accurate result would be if repeat explicitly strstr computations and find that real length. But I've just added non-strict check assuming the worst case.  
The worst case is strlen(s2)+1. An implementation is allowed to read that much.
The non-strict mode should consider the _best_ case if anything. But I would just delete this complexity and use strlen.

Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:322
@@ +321,3 @@
+        s1 && *s1 ? internal_strchr(s2, s1[r]) - s2 + 1 : 0);
m.guseva wrote:
> dvyukov wrote:
> > I don't believe this is necessary. Make it just:
> > READ_RANGE(s2, REAL(strlen)(s2) + 1);
> Why not?
> Indeed we don't need to read the full length of s2. User may expect it in non-strict case.
Because these are very complex expressions with lots of corner cases and potential off-by-ones.
User should not expect that the function won't read the whole string. There are not such guarantees in the documentation. Documentation refers to arguments as to strings. Strings are addressable zero-terminated regions of memory.

Comment at: lib/sanitizer_common/sanitizer_flags.inc:156
@@ -155,1 +155,3 @@
             "If set check that string arguments are properly null-terminated")
+COMMON_FLAG(bool, intercept_strstr, true,
+            "If set, uses custom wrappers for strstr and strcasestr functions "
m.guseva wrote:
> dvyukov wrote:
> > do we need all these additional flags? why?
> It was discussed in initial patch for this http://reviews.llvm.org/D6056#92330: glider suggested to have individual flags. I agree, it's more flexible.
I am not very strong about this

Comment at: lib/sanitizer_common/sanitizer_platform_interceptors.h:58
@@ +57,3 @@
m.guseva wrote:
> dvyukov wrote:
> > this is unused, delete
> It's my bug, SANITIZER_INTERCEPT_STRCASESTR must be used in sanitizer_common_interceptors.inc cause indeed there is no strcasestr on Windows. I'll fix it. 
> The rest is always 1 so I agree it could be removed. I wonder why there are other defines allways equal to 1 in sanitizer_platform_interceptors.h?
I mean delete them because they are unused, not because they are always 1.
I guess we have always 1 defines for flexibility and consistency.



More information about the llvm-commits mailing list