<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Invalid code generation from SIMD intrinsics when function is inlined"
href="https://bugs.llvm.org/show_bug.cgi?id=36723">36723</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Invalid code generation from SIMD intrinsics when function is inlined
</td>
</tr>
<tr>
<th>Product</th>
<td>clang
</td>
</tr>
<tr>
<th>Version</th>
<td>4.0
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>-New Bugs
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedclangbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>sylvain.laperche@scality.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=20061" name="attach_20061" title="testcase">attachment 20061</a> <a href="attachment.cgi?id=20061&action=edit" title="testcase">[details]</a></span>
testcase
Hi,
Clang 4.x miscompiles the attached code when the function is inlined.
To reproduce the issue, the code can be compiled with:
clang++ -std=c++11 -O2 -msse4.1 -o testcase testcase.cpp
Then it can be run with:
./testcase 25532 0
I had to pass the values on the command line; otherwise everything is computed
at compile time :D. With no optimisation enabled, the bug doesn't occur.
The code seems free of undefined behavior, but I may have missed something.
Valgrind and `-fsanitize=undefined` don't complain.
The function sub_test is correctly compiled (checked with objdump):
0000000000400990 <_Z8sub_testooj>:
400990: 66 41 0f 6e c0 movd %r8d,%xmm0
400995: 66 0f 70 c0 00 pshufd $0x0,%xmm0,%xmm0
40099a: 66 48 0f 6e ce movq %rsi,%xmm1
40099f: 66 48 0f 6e d7 movq %rdi,%xmm2
4009a4: 66 0f 6c d1 punpcklqdq %xmm1,%xmm2
4009a8: 66 48 0f 6e c9 movq %rcx,%xmm1
4009ad: 66 48 0f 6e da movq %rdx,%xmm3
4009b2: 66 0f 6c d9 punpcklqdq %xmm1,%xmm3
4009b6: 66 0f 6f cb movdqa %xmm3,%xmm1
4009ba: 66 0f 66 ca pcmpgtd %xmm2,%xmm1
4009be: 66 0f db c8 pand %xmm0,%xmm1
4009c2: 66 0f fa d3 psubd %xmm3,%xmm2
4009c6: 66 0f fe d1 paddd %xmm1,%xmm2
4009ca: 66 48 0f 7e d0 movq %xmm2,%rax
4009cf: 66 48 0f 3a 16 d2 01 pextrq $0x1,%xmm2,%rdx
4009d6: c3 retq
4009d7: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
4009de: 00 00
*BUT* when the function is inlined (in main), the code becomes buggy:
[…]
400af3: 66 48 0f 6e c0 movq %rax,%xmm0
400af8: 66 48 0f 6e ca movq %rdx,%xmm1
400afd: 66 0f 6c c8 punpcklqdq %xmm0,%xmm1
400b01: 66 48 0f 6e c1 movq %rcx,%xmm0
400b06: 66 48 0f 6e d6 movq %rsi,%xmm2
400b0b: 66 0f 6c d0 punpcklqdq %xmm0,%xmm2
400b0f: 66 0f 6f c2 movdqa %xmm2,%xmm0
400b13: 66 0f 66 c1 pcmpgtd %xmm1,%xmm0
400b17: 66 0f 72 d0 1f psrld $0x1f,%xmm0
400b1c: 66 0f fa ca psubd %xmm2,%xmm1
400b20: 66 0f fe c8 paddd %xmm0,%xmm1
400b24: 66 48 0f 7e c8 movq %xmm1,%rax
400b29: 66 48 0f 3a 16 c9 01 pextrq $0x1,%xmm1,%rcx
[…]
In the inlined copy, the `pand` (corresponding to `_mm_and_si128`) that follows
the `pcmpgtd` is replaced by a `psrld $0x1f`.
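To illustrate why this substitution matters, here is a minimal scalar sketch of
one 32-bit lane. It is reconstructed from the two disassembly listings above,
not from the attached testcase, and the names (u32, cmpgt_mask, lane_with_pand,
lane_with_psrld) are hypothetical. It assumes `unsigned int` is 32 bits wide.
Under those assumptions, the two sequences agree only when the broadcast third
argument is 1 in every lane.

```cpp
// Scalar model of a single 32-bit SIMD lane. All names are hypothetical.
typedef unsigned int u32;  // assumption: 32 bits wide, as on x86-64

// pcmpgtd: signed compare, all-ones when lhs > rhs, zero otherwise.
constexpr u32 cmpgt_mask(int lhs, int rhs) {
    return lhs > rhs ? 0xFFFFFFFFu : 0u;
}

// Standalone sub_test listing: pcmpgtd, pand with broadcast p, psubd, paddd.
constexpr u32 lane_with_pand(u32 a, u32 b, u32 p) {
    return (a - b) + (cmpgt_mask(int(b), int(a)) & p);
}

// Inlined listing: pcmpgtd, psrld $0x1f, psubd, paddd.
// Shifting the mask right by 31 leaves 1 in lanes where b > a.
constexpr u32 lane_with_psrld(u32 a, u32 b) {
    return (a - b) + (cmpgt_mask(int(b), int(a)) >> 31);
}

// With p == 7 the two sequences disagree as soon as b > a ...
static_assert(lane_with_pand(3, 5, 7) != lane_with_psrld(3, 5),
              "pand and psrld differ when p != 1");
// ... while with p == 1 they produce the same lane value.
static_assert(lane_with_pand(3, 5, 1) == lane_with_psrld(3, 5),
              "pand and psrld agree when p == 1");
```

When a >= b the mask is zero in both sequences, so the difference only shows up
in lanes where the comparison is true.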
The bug was first encountered on macOS with the following version of Clang:
% g++ --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr
--with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
This corresponds to Clang 4.0.3 according to
<a href="https://en.wikipedia.org/wiki/Xcode#Latest_versions">https://en.wikipedia.org/wiki/Xcode#Latest_versions</a>
I was able to reproduce it on Ubuntu 14.04, using the Clang 4.0 from "deb
<a href="http://apt.llvm.org/trusty/">http://apt.llvm.org/trusty/</a> llvm-toolchain-trusty-4.0 main"
% clang++ --version
clang version 4.0.1-svn305264-1~exp1 (branches/release_40)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Note that the generated code is correct with Clang 3.4 (tested on Ubuntu 14.04):
% clang++ --version
Ubuntu clang version 3.4-1ubuntu3 (tags/RELEASE_34/final) (based on LLVM
3.4)
Target: x86_64-pc-linux-gnu
Thread model: posix
There is no problem with Clang 5.0 either (tested on Arch Linux):
% clang++ --version
clang version 5.0.1 (tags/RELEASE_501/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
So this seems specific to the 4.x branch.
I don't know the policy (do bugs get fixed in the 4.x branch, or is "upgrade
your compiler" the official fix?), so I am reporting it just in case.</pre>
</div>
</p>
</body>
</html>