[PATCH] More string interceptors: strstr, strcasestr, strcspn, strpbrk
dvyukov at google.com
Wed Apr 22 00:38:05 PDT 2015
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:322
@@ +321,3 @@
+ COMMON_INTERCEPTOR_READ_STRING(ctx, s2,
+ s1 && *s1 ? internal_strchr(s2, s1[r]) - s2 + 1 : 0);
+ COMMON_INTERCEPTOR_READ_STRING(ctx, s1, r + 1);
> m.guseva wrote:
> > dvyukov wrote:
> > > m.guseva wrote:
> > > > dvyukov wrote:
> > > > > I don't believe this is necessary. Make it just:
> > > > > READ_RANGE(s2, REAL(strlen)(s2) + 1);
> > > > Why not?
> > > > Indeed we don't need to read the full length of s2. User may expect it in non-strict case.
> > > Because these are very complex expressions with lots of corner cases and potential off-by-ones.
> > > User should not expect that the function won't read the whole string. There are not such guarantees in the documentation. Documentation refers to arguments as to strings. Strings are addressable zero-terminated regions of memory.
> > >
> > >User should not expect that the function won't read the whole string. There are not such guarantees in the documentation.
> > The "strict_string_checks" flag was introduced exactly cause user may expect such behaviour despite of documentation. Which are corner cases here? Can I just cover them?
> I guess Dmitry means that there is a tradeoff between functionality and code complexity. And he'd prefer to drop overly complex checks in favor of simplicity.
Yes, what Yury said. "r ? internal_strchr(s2, *r) - s2 + 1 : *s1 ? REAL(strlen)(s2) + 1 : 0" is complex without providing any obvious benefit.
If anything, non-strict mode check at _most_ (rather than at least) as much as needed for getting the result. But what is required to get the result? It depends on the implementation of the function in libc.
For example, I, as a user, can argue than strstr("aaaaaaaaaaaa", "bcdef") must read no more than 1 byte from the second string. While your non-strict implementation checks more, thus will cause a crash which looks non-legitimate to me.
The non-strict checks were introduced based on real cases that we observed and were unable to fix. Adding them proactively looks wrong to me. If a user hits such a case, then the first thing he should try is to fix the code (formally it is illegal). This should cover majority of the cases. If fixing the code is not possible, then he can use the flag to disable the interceptor. Meanwhile we can figure out whether we want to relax checking in the interceptor or not.
More information about the llvm-commits