[PATCH] Moar string interceptors: strstr, strcasestr, strcspn, strpbrk

Alexander Potapenko glider at google.com
Wed Nov 12 05:02:57 PST 2014


There are several places where the shadow checks are too aggressive compared to what the glibc implementations of these functions do.
Because the standards are unlikely to define which optimizations are applicable to each function, I think we'd better be more conservative here.

================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:260
@@ +259,3 @@
+    uptr s2_len = REAL(strlen)(s2) + 1;
+    COMMON_INTERCEPTOR_READ_RANGE(ctx, s2, s2_len);
+    r = REAL(strstr)(s1, s2);
----------------
If haystack is shorter than the needle, there's no point in reading the whole contents of needle.
The generic glibc implementation of strstr() indeed makes this optimization (not sure about the SSE-based ones).

================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:302
@@ +301,3 @@
+    uptr s2_len = REAL(strlen)(s2) + 1;
+    COMMON_INTERCEPTOR_READ_RANGE(ctx, s2, s2_len);
+    r = REAL(strcspn)(s1, s2);
----------------
Only in the case when s1 is not an empty string, otherwise there's no point in reading s2.

================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:312
@@ +311,3 @@
+
+INTERCEPTOR(char *, strpbrk, const char *s1, const char *s2) {
+  void *ctx;
----------------
I never realized all these functions crash if we pass NULL as one of the arguments.
Fortunately we don't need to handle that.

================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:320
@@ +319,3 @@
+    uptr s2_len = REAL(strlen)(s2) + 1;
+    COMMON_INTERCEPTOR_READ_RANGE(ctx, s2, s2_len);
+    r = REAL(strpbrk)(s1, s2);
----------------
This is a bit pessimistic. For the following call:
  strpbrk("foobar", "fb")
only the first byte of |s2| will be read.

================
Comment at: lib/sanitizer_common/sanitizer_flags.cc:161
@@ +160,3 @@
+       "wrappers for strstr and strcasestr functions to find more errors.");
+  ParseFlag(str, &f->replace_strcspn, "replace_strcspn", "If set, uses custom "
+       "wrappers for strcspn and strpbrk functions to find more errors.");
----------------
It's not clear why replace_strcspn controls the behavior of strpbrk()
I suggest to either rename the flag to "replace_strcspn_strpbrk" or introduce the "replace_strpbrk" flag.
My dream is to ultimately have a dynamically generated "replace_xyz" flag for almost every interceptor, so the second option is better IMO.

http://reviews.llvm.org/D6056






More information about the llvm-commits mailing list