[compiler-rt] r284901 - [sanitizers] support strict_string_checks for strncmp

Kostya Serebryany via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 11 17:21:04 PST 2016


btw, Nick, do you have a test where this fails?

On Tue, Nov 8, 2016 at 5:51 PM, Nick Lewycky <nlewycky at google.com> wrote:

> On 8 November 2016 at 17:20, Kostya Serebryany <kcc at google.com> wrote:
>
>>
>>
>>
>>
>> On Tue, Nov 8, 2016 at 2:30 PM, Nick Lewycky <nlewycky at google.com> wrote:
>>
>>> On 21 October 2016 at 16:52, Kostya Serebryany via llvm-commits <
>>> llvm-commits at lists.llvm.org> wrote:
>>>
>>>> Author: kcc
>>>> Date: Fri Oct 21 18:52:26 2016
>>>> New Revision: 284901
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=284901&view=rev
>>>> Log:
>>>> [sanitizers] support strict_string_checks for strncmp
>>>>
>>>> Added:
>>>>     compiler-rt/trunk/test/asan/TestCases/strncmp_strict.c
>>>> Modified:
>>>>     compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_inte
>>>> rceptors.inc
>>>>
>>>> Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_inte
>>>> rceptors.inc
>>>> URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sa
>>>> nitizer_common/sanitizer_common_interceptors.inc?rev=284901&
>>>> r1=284900&r2=284901&view=diff
>>>> ============================================================
>>>> ==================
>>>> --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc
>>>> (original)
>>>> +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc
>>>> Fri Oct 21 18:52:26 2016
>>>> @@ -304,8 +304,8 @@ INTERCEPTOR(int, strncmp, const char *s1
>>>>      c2 = (unsigned char)s2[i];
>>>>      if (c1 != c2 || c1 == '\0') break;
>>>>    }
>>>> -  COMMON_INTERCEPTOR_READ_RANGE(ctx, s1, Min(i + 1, size));
>>>> -  COMMON_INTERCEPTOR_READ_RANGE(ctx, s2, Min(i + 1, size));
>>>> +  COMMON_INTERCEPTOR_READ_STRING(ctx, s1, Min(i + 1, size));
>>>> +  COMMON_INTERCEPTOR_READ_STRING(ctx, s2, Min(i + 1, size));
>>>>
>>>
>>> I think this change is wrong, but please double check my analysis. I
>>> think arguments to strncmp don't need to be NUL terminated like arguments
>>> to strcmp do.
>>>
>>> C99 7.1.1 defines: "A string is a contiguous sequence of characters
>>> terminated by and including the first null character." Then strcmp is
>>> defined as a function which "compares the strings" in 7.21.4.2 while
>>> strncmp is defined to operate over two "possibly null-terminated array"s in
>>> 7.21.4.4. The bug manifests in COMMON_INTERCEPTOR_READ_STRING calling
>>> REAL(strlen) on these arguments which aren't guaranteed to be NUL
>>> terminated.
>>>
>>> What do you think?
>>>
>>
>> Mmmm. Are these parts equivalent in C99 and C++ standards?
>> At least here http://www.cplusplus.com/reference/cstring/strncmp/ I
>> read:
>> str1C string to be compared.str2C string to be compared.
>>
> I think they're simplifying. But cppreference.com also disagrees with me:
> http://en.cppreference.com/w/cpp/string/byte/strncmp . They have separate
> entries for strncmp in C and for C++ and they differ, for C++ it says they
> compare two null-terminated strings, while in C it says they compare two
> possibly null-terminated strings.
>
> The C++ standard doesn't define either strcmp or strncmp. I think the
> normative text is in 21.8 [c.strings]:
> "1 Tables 74, 75, 76, 77, 78, and 79 describe headers <cctype>, <cwctype>,
> <cstring>, <cwchar>, <cstdlib>
> (character conversions), and <cuchar>, respectively."
>
> The tables list all the various functions, table 76 includes both strcmp
> and strncmp.
>
> "2 The contents of these headers shall be the same as the Standard C
> Library headers <ctype.h>, <wctype.h>,
> <string.h>, <wchar.h>, and <stdlib.h> and the C Unicode TR header
> <uchar.h>, respectively, with the
> following modifications:"
>
> The modifications that follow affect types char16_t, char32_t and wchar_t
> and functions strchr, strpbrk, strrchr, strstr, memchr, wcschr, wcspbrk,
> wcsrchr, wcsstr, wmemchr, strerror, strtok, as well as anything that takes
> an mbstate_t*.
>
> I can't see anything else that would make the semantics different between
> C and C++.
>
> I probably still need to replace this with something like
>> COMMON_INTERCEPTOR_READ_RANGE((ctx), (s),                       \
>>       common_flags()->strict_string_checks ? size : (Min(i + 1, size)) )
>> to avoid calling strlen
>>
>
> I don't think that's quite right either. strncmp will read up to the first
> NUL at which point it stops, or it reads up to size. That makes it:
>
>   unsigned char c1 = 0, c2 = 0;
>   uptr i;
>   for (i = 0; i < size; i++) {
>     c1 = (unsigned char)s1[i];
>     c2 = (unsigned char)s2[i];
>     if (c1 != c2 || c1 == '\0') {
>       ++i;  // include trailing NUL
>       break;
>     }
>   }
>   assert(i <= size);
>   COMMON_INTERCEPTOR_READ_RANGE((ctx), (s1), i)
>   COMMON_INTERCEPTOR_READ_RANGE((ctx), (s2), i)
>
> and there's no need to check for strict string handling?
>
> Nick
>
> Nick
>>>
>>>    int result = CharCmpX(c1, c2);
>>>>    CALL_WEAK_INTERCEPTOR_HOOK(__sanitizer_weak_hook_strncmp,
>>>> GET_CALLER_PC(), s1,
>>>>                               s2, size, result);
>>>>
>>>> Added: compiler-rt/trunk/test/asan/TestCases/strncmp_strict.c
>>>> URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/a
>>>> san/TestCases/strncmp_strict.c?rev=284901&view=auto
>>>> ============================================================
>>>> ==================
>>>> --- compiler-rt/trunk/test/asan/TestCases/strncmp_strict.c (added)
>>>> +++ compiler-rt/trunk/test/asan/TestCases/strncmp_strict.c Fri Oct 21
>>>> 18:52:26 2016
>>>> @@ -0,0 +1,26 @@
>>>> +// Test strict_string_checks option in strncmp function
>>>> +// RUN: %clang_asan %s -o %t && %run %t 2>&1
>>>> +// RUN: %env_asan_opts=strict_string_checks=false %run %t 2>&1
>>>> +// RUN: %env_asan_opts=strict_string_checks=true not %run %t 2>&1 |
>>>> FileCheck %s
>>>> +
>>>> +#include <assert.h>
>>>> +#include <stdlib.h>
>>>> +#include <string.h>
>>>> +
>>>> +int main(int argc, char **argv) {
>>>> +  size_t size = 100;
>>>> +  char fill = 'o';
>>>> +  char *s1 = (char*)malloc(size);
>>>> +  memset(s1, fill, size);
>>>> +  char *s2 = (char*)malloc(size);
>>>> +  memset(s2, fill, size);
>>>> +  s1[size - 1] = 'z';
>>>> +  s2[size - 1] = 'x';
>>>> +  int r = strncmp(s1, s2, size + 1);
>>>> +  // CHECK: {{.*ERROR: AddressSanitizer: heap-buffer-overflow on
>>>> address}}
>>>> +  // CHECK: READ of size 101
>>>> +  assert(r == 1);
>>>> +  free(s1);
>>>> +  free(s2);
>>>> +  return 0;
>>>> +}
>>>>
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>
>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161111/3232aa1a/attachment.html>


More information about the llvm-commits mailing list