[PATCH] D30384: [asan] Add an interceptor for strtok
Yury Gribov via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 15 14:56:22 PDT 2017
ygribov added inline comments.
================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:494
+ uptr del_length = REAL(strlen)(delimiters);
+ COMMON_INTERCEPTOR_READ_STRING_OF_LEN(ctx, delimiters, del_length, 1);
+ return result;
----------------
alekseyshl wrote:
> mrigger wrote:
> > alekseyshl wrote:
> > > mrigger wrote:
> > > > alekseyshl wrote:
> > > > > First, it should be ..., del_length, del_length + 1); and second, it does not make sense to use COMMON_INTERCEPTOR_READ_STRING_OF_LEN here,
> > > > >
> > > > > COMMON_INTERCEPTOR_READ_RANGE(ctx, delimiters, REAL(strlen)(delimiters) + 1);
> > > > >
> > > > > would be enough.
> > > > >
> > > > I implemented it like you suggested in my first revision. The current implementation is based on @ygribov's feedback and his explanations about `strict_strings`. Did I understand them wrong?
> > > You're explicitly calling strlen on that string, it will look for the \0 anyway. Check what COMMON_INTERCEPTOR_READ_STRING_OF_LEN is doing, substitute your arguments.
> > >
> > > Your last argument being just 1 should definitely be fixed, now you're checking just the first char, right?
> > Note, that we use the not-intercepted `strlen`. Following, an out-of-bounds access might not result in an error here.
> >
> > Yes, we only check the first delimiter (for `strict_string_checks=0`), as @ygribov suggested, if I understood him right. @ygribov, can you please comment to clear things up?
> >
> > Here is a demonstration of the behavior on an example:
> > ```
> > #include <stdio.h>
> > #include <string.h>
> >
> > int main() {
> > char buf[] = "abcb";
> > char del[] = {'b'}; // missing NULL terminator
> > char* result = strtok(buf, del);
> > printf("%s\n", result);
> > }
> > ```
> > Executing the program with `ASAN_OPTIONS=strict_string_checks=1 ./a.out` lets ASan output an error. However, executing with `ASAN_OPTIONS=strict_string_checks=0 ./a.out` does not result in an error.
> >
> > Here is also the documentation for `strict_string_checks`:
> > ```
> > strict_string_checks
> > - If set check that string arguments are properly null-terminated
> > ```
> >
> > I'm happy to change it, but I want to be sure that we implement the expected behavior.
> >
> >
> The code I proposed uses non-intercepted strlen as well:
>
> COMMON_INTERCEPTOR_READ_RANGE(ctx, delimiters, REAL(strlen)(delimiters) + 1);
>
> The problem I'm trying to point out is not about the first delimiter (that's fine, but it's about the other COMMON_INTERCEPTOR_READ_STRING_OF_LEN use, the one above), it's that you pass n=1 to your second macro, this one:
>
> COMMON_INTERCEPTOR_READ_STRING_OF_LEN(ctx, delimiters, del_length, 1);
>
> which means "1 char". I believe, it was not your intention.
Sorry guys, I was out. Frankly I agree with @alekseyshl in that we shouldn't unconditionally call `strlen` on `str` and `delimeters`. They are not necessarily null-terminated so by calling `REAL(strlen)` when strict_strings is off, you risk triggering a runtime error.
See my suggestions below for how to change the code to avoid strlens.
================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:500
+INTERCEPTOR(char*, strtok, char *str, const char *delimiters) {
+ void *ctx;
+ COMMON_INTERCEPTOR_ENTER(ctx, strtok, str, delimiters);
----------------
Bind star to `ctx`?
================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:502
+ COMMON_INTERCEPTOR_ENTER(ctx, strtok, str, delimiters);
+ if (common_flags()->intercept_strtok) {
+ char* result;
----------------
I'd suggest to handle `!intercept_strtok` case here as this will allow to reduce function nesting. It's minor though.
================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:508
+ // static buffer for subsequent invocations.
+ result = REAL(strtok)(str, delimiters);
+ } else {
----------------
`Strtok` will return NULL or zero-terminated string so you should be able to do plain `COMMON_INTERCEPTOR_READ_RANGE` on `str` (probably only when strict_strings is off as otherwise `str` has been completely checked on the first call).
================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:510
+ } else {
+ uptr str_length = REAL(strlen)(str);
+ result = REAL(strtok)(str, delimiters);
----------------
I agree with Alex - we shouldn't call `strlen` here, because if `str` isn't zero-terminated we risk to trigger runtime error. The best we can do is
COMMON_INTERCEPTOR_READ_STRING(ctx, str, 1);
Note that 1 is suboptimal here - we could have used strspn to find the first occurence of delimeters and use that to better estimate the length.
================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:515
+ COMMON_INTERCEPTOR_READ_STRING_OF_LEN(ctx, str, str_length,
+ result_length + 1);
+ }
----------------
Same to line 508
COMMON_INTERCEPTOR_READ_RANGE(ctx, result, REAL(strlen)(result) + 1);
?
================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:516
+ result_length + 1);
+ }
+ }
----------------
I think else-branch shouldn't be neglected - you can safely do
COMMON_INTERCEPTOR_READ_RANGE(ctx, str, REAL(strlen) + 1)
(we know that none of delimeters has been found so `str` must have been scanned to the terminating 0).
================
Comment at: lib/sanitizer_common/sanitizer_common_interceptors.inc:519
+ uptr del_length = REAL(strlen)(delimiters);
+ COMMON_INTERCEPTOR_READ_STRING_OF_LEN(ctx, delimiters, del_length, 1);
+ return result;
----------------
Same issue with `strlen` here. Just do
COMMON_INTERCEPTOR_READ_STRING(ctx, delimeters, 1);
(1 is again suboptimal).
https://reviews.llvm.org/D30384
More information about the llvm-commits
mailing list