<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - int have_read = read (...) --- unable to deduce that &quot;have_read &gt;= -1&quot;"
href="https://bugs.llvm.org/show_bug.cgi?id=44360">44360</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>int have_read = read (...) --- unable to deduce that "have_read >= -1"
</td>
</tr>
<tr>
<th>Product</th>
<td>clang
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>C
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedclangbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>safinaskar@mail.ru
</td>
</tr>
<tr>
<th>CC</th>
<td>blitzrakete@gmail.com, dgregor@apple.com, erik.pilkington@gmail.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
</td>
</tr></table>
<p>
<div>
<pre>Here is my attempt to write a fast implementation of "wc -l":
----
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int
main (void)
{
  const int block_size = 1024 * 1024;
  char *block = (char *)malloc (block_size);
  int result = 0;
  for (;;)
    {
      int have_read = read (0, block, block_size);
      if (have_read == -1)
        {
          abort ();
        }
      if (have_read == 0)
        {
          break;
        }
      for (int i = 0; i != have_read; ++i)
        {
          if (block[i] == '\n')
            {
              ++result;
            }
        }
    }
  printf ("%d\n", result);
}
----
I compiled this code using this command: "clang-10 -flto -O3 -x c SOURCE -o
/tmp/block"
Then I ran the binary with: "time -p /tmp/block < DATA" (DATA is a video
file of 7'551'083'315 bytes located on an ext4 file system).
The time is 4.44 seconds (of course, I ran the test multiple times).
Then I replaced "have_read == -1" with "have_read < 0". The time is 3.41 seconds.
So we see that the compiler is unable to optimize the original version. I think
this is a bug.
First: the compiler should know that "read" always returns a value ">= -1".
Second: the compiler should reason as follows: if "have_read" is negative, then
"++i" in the inner loop will eventually overflow "i". But signed integer overflow
is UB, so the compiler can assume that "have_read" is non-negative.
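A minimal sketch of stating that range fact in the source instead of relying on
the compiler to deduce it. The helper name count_newlines is hypothetical, and
__builtin_unreachable () is a GCC/Clang extension, not standard C. Whether this
hint changes the timings reported here has not been measured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts '\n' bytes in block[0 .. have_read).  */
int
count_newlines (const char *block, int have_read)
{
  assert (block != NULL);
  /* Assumption made explicit: the caller has already handled the -1
     error return of read (), so have_read is never negative here.
     __builtin_unreachable () is a GCC/Clang extension.  */
  if (have_read < 0)
    {
      __builtin_unreachable ();
    }
  int result = 0;
  for (int i = 0; i != have_read; ++i)
    {
      if (block[i] == '\n')
        {
          ++result;
        }
    }
  return result;
}
```

The caller would keep the "have_read == -1" and "have_read == 0" checks and pass
only positive counts to this helper.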
Also, if I keep "have_read == -1" as is but change "int have_read" to "ssize_t
have_read", I also get 3.41 seconds.
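A minimal sketch of the ssize_t variant, written as a function so that it can be
exercised on any file descriptor. The name count_lines_fd and its parameters are
hypothetical, and the loop index is also declared ssize_t here, which goes one
step beyond the single-declaration change described above. This is not the exact
program that was timed.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: counts '\n' bytes read from fd until end of file,
   using the caller's buffer of block_size bytes.  */
int
count_lines_fd (int fd, char *block, size_t block_size)
{
  assert (block != NULL && block_size > 0);
  int result = 0;
  for (;;)
    {
      /* ssize_t is the declared return type of read (), so the value is
         not narrowed to int before it is compared with -1.  */
      ssize_t have_read = read (fd, block, block_size);
      if (have_read == -1)
        {
          abort ();
        }
      if (have_read == 0)
        {
          break;
        }
      for (ssize_t i = 0; i != have_read; ++i)
        {
          if (block[i] == '\n')
            {
              ++result;
            }
        }
    }
  return result;
}
```

Calling count_lines_fd (0, block, block_size) with the 1 MiB buffer from the
program above and printing the result would correspond to the original main.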
Also, gcc (version 6.3.0-18+deb9u1 from Debian) gives 3.38 seconds on the initial
code, 3.02 seconds on the initial code with ssize_t, and 3.01 seconds on the
initial code with "have_read < 0".
----
Debian stretch (with some packages from Debian buster), x86_64, Linux 4.19
"clang-10 -v" output:
----
clang version 10.0.0-+20191211115110+02168549172-1~exp1~20191211105657.1646
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/i686-linux-gnu/6
Found candidate GCC installation: /usr/bin/../lib/gcc/i686-linux-gnu/6.3.0
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/6.3.0
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/6.3.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.3.0
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/6.3.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
----
This is clang installed from <a href="https://apt.llvm.org">https://apt.llvm.org</a>
Intel Core i7; hyper-threading is disabled at the BIOS level, and /proc/cpuinfo
reports 4 cores</pre>
</div>
</p>
</body>
</html>