<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - memset with length 2^N where N=2..7 is vectorized even with -Oz enabled"
href="https://bugs.llvm.org/show_bug.cgi?id=51854">51854</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>memset with length 2^N where N=2..7 is vectorized even with -Oz enabled
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Windows NT
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>vdsered@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, pengfei.wang@intel.com, spatel+llvm@rotateright.com
</td>
</tr></table>
<p>
<div>
<pre>With -Oz and -Os, Clang vectorizes memset when the length is 2^N for
N=2..7. GCC, for example, does not do this. Vectorizing this code is probably
fine at -O3, but it should not happen at -Oz.
Source:
#include &lt;string.h&gt;

void func(int *P) {
  memset(P, 0, 128);
}
Clang's output with -Oz (trunk, <a href="https://godbolt.org/z/a6vjjxKhz">https://godbolt.org/z/a6vjjxKhz</a>):
func(int*, int): # @func(int*, int)
        xorps   xmm0, xmm0
        movups  xmmword ptr [rdi + 112], xmm0
        movups  xmmword ptr [rdi + 96], xmm0
        movups  xmmword ptr [rdi + 80], xmm0
        movups  xmmword ptr [rdi + 64], xmm0
        movups  xmmword ptr [rdi + 48], xmm0
        movups  xmmword ptr [rdi + 32], xmm0
        movups  xmmword ptr [rdi + 16], xmm0
        movups  xmmword ptr [rdi], xmm0
        ret
If the length is greater than 128, Clang with -Oz/-Os emits a tail call to memset instead:
func(int*, int): # @func(int*, int)
        mov     edx, 256
        xor     esi, esi
        jmp     memset@PLT                      # TAILCALL
GCC with -Os emits the same code for any length (see
<a href="https://godbolt.org/z/1shqe319r">https://godbolt.org/z/1shqe319r</a>):
func(int*, int):
        mov     ecx, X          # X = length / 4 (rep stosd stores dwords)
        xor     eax, eax
        rep stosd
        ret
So with -Os and -Oz we expect Clang not to vectorize here, and to generate the
same code as in the length > 128 case.</pre>
</div>
</p>
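<p>To compare the code generated for each length in a single translation unit, the example from the report could be extended along the following lines. This is only a sketch: the <code>DEFINE_ZERO</code> macro, the <code>zero_N</code> function names, and the <code>all_zero</code> and <code>self_check</code> helpers are hypothetical names introduced for illustration and do not appear in the original report.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper macro: defines zero_<N>(), which wraps a single
   constant-length memset so that the assembly emitted for each length
   can be inspected separately. */
#define DEFINE_ZERO(N) \
    void zero_##N(void *p) { memset(p, 0, N); }

/* The lengths the report describes as vectorized (2^N, N = 2..7). */
DEFINE_ZERO(4)
DEFINE_ZERO(8)
DEFINE_ZERO(16)
DEFINE_ZERO(32)
DEFINE_ZERO(64)
DEFINE_ZERO(128)

/* One length above 128, where the report shows a tail call to memset. */
DEFINE_ZERO(256)

/* Returns 1 if the first n bytes at p are all zero, 0 otherwise. */
int all_zero(const unsigned char *p, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Sanity check that the wrappers behave like plain memset calls,
   whichever lowering the compiler chooses. */
void self_check(void) {
    unsigned char buf[256];

    memset(buf, 0xAB, sizeof buf);
    zero_128(buf);
    assert(all_zero(buf, 128));
    assert(buf[128] == 0xAB); /* bytes past the length stay untouched */

    zero_256(buf);
    assert(all_zero(buf, sizeof buf));
}
```

<p>Compiling this file with <code>-Oz -S</code> under both compilers and reading the body of each <code>zero_N</code> function should show which lengths are expanded inline and which become a call to memset.</p>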
</body>
</html>