<div dir="ltr"><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Nov 11, 2016 at 5:21 PM, Kostya Serebryany <span dir="ltr"><<a href="mailto:kcc@google.com" target="_blank">kcc@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">btw, Nick, do you have a test where this fails? </div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Nov 8, 2016 at 5:51 PM, Nick Lewycky <span dir="ltr"><<a href="mailto:nlewycky@google.com" target="_blank">nlewycky@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div class="m_-6893603495633923300h5">On 8 November 2016 at 17:20, Kostya Serebryany <span dir="ltr"><<a href="mailto:kcc@google.com" target="_blank">kcc@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><br><div><br></div><div><br></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="m_-6893603495633923300m_5717655743882326564gmail-h5">On Tue, Nov 8, 2016 at 2:30 PM, Nick Lewycky <span dir="ltr"><<a href="mailto:nlewycky@google.com" target="_blank">nlewycky@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class="m_-6893603495633923300m_5717655743882326564gmail-m_-4476846632077403236gmail-">On 21 October 2016 at 16:52, Kostya Serebryany via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br></span><span class="m_-6893603495633923300m_5717655743882326564gmail-m_-4476846632077403236gmail-"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Author: kcc<br>
Date: Fri Oct 21 18:52:26 2016<br>
New Revision: 284901<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=284901&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject?rev=284901&view=rev</a><br>
Log:<br>
[sanitizers] support strict_string_checks for strncmp<br>
<br>
Added:<br>
compiler-rt/trunk/test/asan/Te<wbr>stCases/strncmp_strict.c<br>
Modified:<br>
compiler-rt/trunk/lib/sanitize<wbr>r_common/sanitizer_common_inte<wbr>rceptors.inc<br>
<br>
Modified: compiler-rt/trunk/lib/sanitize<wbr>r_common/sanitizer_common_inte<wbr>rceptors.inc<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc?rev=284901&r1=284900&r2=284901&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/compiler-rt/trunk/lib/sa<wbr>nitizer_common/sanitizer_commo<wbr>n_interceptors.inc?rev=284901&<wbr>r1=284900&r2=284901&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- compiler-rt/trunk/lib/sanitize<wbr>r_common/sanitizer_common_inte<wbr>rceptors.inc (original)<br>
+++ compiler-rt/trunk/lib/sanitize<wbr>r_common/sanitizer_common_inte<wbr>rceptors.inc Fri Oct 21 18:52:26 2016<br>
@@ -304,8 +304,8 @@ INTERCEPTOR(int, strncmp, const char *s1<br>
c2 = (unsigned char)s2[i];<br>
if (c1 != c2 || c1 == '\0') break;<br>
}<br>
- COMMON_INTERCEPTOR_READ_RANGE(<wbr>ctx, s1, Min(i + 1, size));<br>
- COMMON_INTERCEPTOR_READ_RANGE(<wbr>ctx, s2, Min(i + 1, size));<br>
+ COMMON_INTERCEPTOR_READ_STRING<wbr>(ctx, s1, Min(i + 1, size));<br>
+ COMMON_INTERCEPTOR_READ_STRING<wbr>(ctx, s2, Min(i + 1, size));<br></blockquote><div><br></div></span><div>I think this change is wrong, but please double check my analysis. I think arguments to strncmp don't need to be NUL terminated like arguments to strcmp do.</div><div><br></div><div>C99 7.1.1 defines: "A string is a contiguous sequence of characters terminated by and including the first null character." Then strcmp is defined as a function which "compares the strings" in 7.21.4.2 while strncmp is defined to operate over two "possibly null-terminated array"s in 7.21.4.4. The bug manifests in COMMON_INTERCEPTOR_READ_STRING calling REAL(strlen) on these arguments which aren't guaranteed to be NUL terminated.</div><div><br></div><div>What do you think?</div></div></div></div></blockquote><div><br></div></div></div><div>Mmmm. Are these parts equivalent in C99 and C++ standards? </div><div>At least here <a href="http://www.cplusplus.com/reference/cstring/strncmp/" target="_blank">http://www.cplusplus.com/<wbr>reference/cstring/strncmp/</a> I read: </div><div><dt style="background-color:rgb(240,240,240);font-family:monospace;margin-top:5px;color:rgb(0,0,0);font-size:12px">str1</dt><dd style="margin-bottom:10px;color:rgb(0,0,0);font-family:verdana,arial,helvetica,sans-serif;font-size:12px">C string to be compared.</dd><dt style="background-color:rgb(240,240,240);font-family:monospace;margin-top:5px;color:rgb(0,0,0);font-size:12px">str2</dt><dd style="margin-bottom:10px;color:rgb(0,0,0);font-family:verdana,arial,helvetica,sans-serif;font-size:12px">C string to be compared.</dd></div></div></div></div></blockquote></div></div><div>I think they're simplifying. But <a href="http://cppreference.com" target="_blank">cppreference.com</a> also disagrees with me: <a href="http://en.cppreference.com/w/cpp/string/byte/strncmp" target="_blank">http://en.cppreference.com<wbr>/w/cpp/string/byte/strncmp</a> . They have separate entries for strncmp in C and for C++ and they differ, for C++ it says they compare two null-terminated strings, while in C it says they compare two possibly null-terminated strings.</div><div><br></div><div><div>The C++ standard doesn't define either strcmp or strncmp. I think the normative text is in 21.8 [c.strings]:</div></div><div>"1 Tables 74, 75, 76, 77, 78, and 79 describe headers <cctype>, <cwctype>, <cstring>, <cwchar>, <cstdlib></div><div>(character conversions), and <cuchar>, respectively."</div><div><br></div><div>The tables list all the various functions, table 76 includes both strcmp and strncmp.<br></div><div><br></div><div>"2 The contents of these headers shall be the same as the Standard C Library headers <ctype.h>, <wctype.h>,</div><div><string.h>, <wchar.h>, and <stdlib.h> and the C Unicode TR header <uchar.h>, respectively, with the</div><div>following modifications:"</div><div><br></div><div>The modifications that follow affect types char16_t, char32_t and wchar_t and functions strchr, strpbrk, strrchr, strstr, memchr, wcschr, wcspbrk, wcsrchr, wcsstr, wmemchr, strerror, strtok, as well as anything that takes an mbstate_t*.</div><div><br></div><div>I can't see anything else that would make the semantics different between C and C++.</div><span><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>I probably still need to replace this with something like </div><div><div>COMMON_INTERCEPTOR_READ_RANGE(<wbr>(ctx), (s), \</div><div> common_flags()->strict_string_<wbr>checks ? size : (Min(i + 1, size)) )</div></div><div>to avoid calling strlen</div></div></div></div></blockquote><div><br></div></span><div>I don't think that's quite right either. strncmp will read up to the first NUL at which point it stops, or it reads up to size. That makes it:</div><div><br></div><div><div> unsigned char c1 = 0, c2 = 0;</div><div> uptr i;</div><div> for (i = 0; i < size; i++) {</div><div> c1 = (unsigned char)s1[i];</div><span><div> c2 = (unsigned char)s2[i];</div></span><div> if (c1 != c2 || c1 == '\0') {</div><div> ++i; // include trailing NUL</div><div> break;</div><div> }</div><div> }</div></div><div> assert(i <= size);</div><div> COMMON_INTERCEPTOR_READ_RANGE(<wbr>(ctx), (s1), i)<br></div><div><div> COMMON_INTERCEPTOR_READ_RANGE(<wbr>(ctx), (s2), i)<br></div></div><div><br></div><div>and there's no need to check for strict string handling?</div><span class="m_-6893603495633923300HOEnZb"><font color="#888888"><div><br></div><div>Nick</div></font></span><div><div class="m_-6893603495633923300h5"><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div class="m_-6893603495633923300m_5717655743882326564gmail-h5"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class="m_-6893603495633923300m_5717655743882326564gmail-m_-4476846632077403236gmail-HOEnZb"><font color="#888888"><div>Nick<br></div></font></span><div><div class="m_-6893603495633923300m_5717655743882326564gmail-m_-4476846632077403236gmail-h5"><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
int result = CharCmpX(c1, c2);<br>
CALL_WEAK_INTERCEPTOR_HOOK(__<wbr>sanitizer_weak_hook_strncmp, GET_CALLER_PC(), s1,<br>
s2, size, result);<br>
<br>
Added: compiler-rt/trunk/test/asan/Te<wbr>stCases/strncmp_strict.c<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/asan/TestCases/strncmp_strict.c?rev=284901&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/compiler-rt/trunk/test/a<wbr>san/TestCases/strncmp_strict.c<wbr>?rev=284901&view=auto</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- compiler-rt/trunk/test/asan/Te<wbr>stCases/strncmp_strict.c (added)<br>
+++ compiler-rt/trunk/test/asan/Te<wbr>stCases/strncmp_strict.c Fri Oct 21 18:52:26 2016<br>
@@ -0,0 +1,26 @@<br>
+// Test strict_string_checks option in strncmp function<br>
+// RUN: %clang_asan %s -o %t && %run %t 2>&1<br>
+// RUN: %env_asan_opts=strict_string_c<wbr>hecks=false %run %t 2>&1<br>
+// RUN: %env_asan_opts=strict_string_c<wbr>hecks=true not %run %t 2>&1 | FileCheck %s<br>
+<br>
+#include <assert.h><br>
+#include <stdlib.h><br>
+#include <string.h><br>
+<br>
+int main(int argc, char **argv) {<br>
+ size_t size = 100;<br>
+ char fill = 'o';<br>
+ char *s1 = (char*)malloc(size);<br>
+ memset(s1, fill, size);<br>
+ char *s2 = (char*)malloc(size);<br>
+ memset(s2, fill, size);<br>
+ s1[size - 1] = 'z';<br>
+ s2[size - 1] = 'x';<br>
+ int r = strncmp(s1, s2, size + 1);<br>
+ // CHECK: {{.*ERROR: AddressSanitizer: heap-buffer-overflow on address}}<br>
+ // CHECK: READ of size 101<br>
+ assert(r == 1);<br>
+ free(s1);<br>
+ free(s2);<br>
+ return 0;<br>
+}<br>
<br>
<br>
______________________________<wbr>_________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-commits</a><br>
</blockquote></div></div></div><br><br></div></div>
</blockquote></div></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>