<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - -O1 replaces strncmp with strcmp even if one param is not null-terminated"
href="https://bugs.llvm.org/show_bug.cgi?id=32357">32357</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>-O1 replaces strncmp with strcmp even if one param is not null-terminated
</td>
</tr>
<tr>
<th>Product</th>
<td>new-bugs
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>new bugs
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>kcc@google.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
</td>
</tr></table>
<p>
<div>
<pre>% cat z.c
#include &lt;string.h&gt;
char magic[8];
int foo() { return strncmp(magic, "1234567", 8); }
% clang -S -o - z.c -O0 | grep 'str.*cmp'
callq strncmp
% clang -S -o - z.c -O1 | grep 'str.*cmp'
jmp strcmp # TAILCALL

Richard says:
It's a bug.
C11 7.1.1: "A string is a contiguous sequence of characters terminated by and
including the first null character. [...] A pointer to a string is a pointer to
its initial (lowest addressed) character."
C11 7.24.4.2: "The strcmp function compares the string pointed to by s1 to the
string pointed to by s2."
C11 7.24.4.4: "The strncmp function compares not more than n characters
(characters that follow a null character are not compared) from the array
pointed to by s1 to the array pointed to by s2."
Note that strcmp requires its arguments to be pointers to strings, but strncmp
only requires them to be pointers to arrays. Calling strcmp with pointers that
do not point to strings (presumably) results in undefined behavior. In the
above program, there is no guarantee that "magic" is a pointer to a string, so
it is not correct to convert the strncmp call to a strcmp call.
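
To make the difference in requirements concrete, here is a minimal byte-wise
sketch of the strncmp contract quoted above. The name ref_strncmp is
hypothetical, unsigned long stands in for size_t so the sketch needs no
headers, and no real libc is claimed to work this way. It reads at most n
characters from each array, so neither argument has to be NUL-terminated
within those n bytes.

```c
/* Hypothetical reference sketch of the strncmp contract (C11 7.24.4.4).
   Assumption: unsigned long is used in place of size_t. */
int ref_strncmp(const char *s1, const char *s2, unsigned long n)
{
    unsigned long i;

    /* Examine at most n characters; nothing at index n or beyond is read. */
    for (i = 0; i != n; i++) {
        unsigned char a = (unsigned char)s1[i];
        unsigned char b = (unsigned char)s2[i];

        if (a != b)
            return (int)a - (int)b;   /* first difference decides */
        if (a == '\0')
            return 0;                 /* both ended at the same index */
    }
    return 0;                         /* first n characters are equal */
}
```

With an 8-byte array holding '1' through '8' and no NUL, the call
ref_strncmp(magic, "1234567", 8) stops at index 7, where '8' differs from the
literal's terminating NUL, returns a positive value, and never reads past the
array.
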
A hostile-but-valid strcmp implementation could first apply strlen to both of
its arguments. (And a clever implementation could probably get some performance
advantage by scanning ahead looking for a nul in the next, say, 16 bytes, and
then doing a 16-byte-at-a-time comparison.)</pre>
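<pre>The hostile-but-valid implementation described above might look like the
following sketch. All names (scan_to_nul, hostile_strcmp, touched1) are
hypothetical, and unsigned long stands in for size_t to keep the sketch
header-free. The extra out-parameter is not part of strcmp; it is added only
so the sketch can report how many bytes of the first argument the measuring
pass read.

```c
/* Bytes a strlen-style scan reads: everything up to and including the
   first NUL. Only valid if p really points to a string. */
unsigned long scan_to_nul(const char *p)
{
    unsigned long i = 0;

    while (p[i] != '\0')
        i++;
    return i + 1;
}

/* Hypothetical "hostile" strcmp: it measures both arguments in full
   before comparing a single byte. *touched1 receives the number of
   bytes of s1 read by the measuring pass. */
int hostile_strcmp(const char *s1, const char *s2, unsigned long *touched1)
{
    unsigned long i;

    *touched1 = scan_to_nul(s1);
    (void)scan_to_nul(s2);

    for (i = 0; ; i++) {
        unsigned char a = (unsigned char)s1[i];
        unsigned char b = (unsigned char)s2[i];

        if (a != b)
            return (int)a - (int)b;
        if (a == '\0')
            return 0;
    }
}
```

Passing a 16-byte buffer initialised with "12345678ABC" as s1 and "1234567"
as s2 models magic[8] followed by unrelated bytes. The result is still
positive, but the measuring pass reports 12 bytes read from s1, 4 more than
the 8 that strncmp was permitted to examine. With a real 8-byte array those
extra reads would be out of bounds.</pre>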
</div>
</p>
</body>
</html>