<html>
<head>
<base href="http://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - ASM-based SIMD code -O2 performance and -g compilation errors"
href="http://llvm.org/bugs/show_bug.cgi?id=17195">17195</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>ASM-based SIMD code -O2 performance and -g compilation errors
</td>
</tr>
<tr>
<th>Product</th>
<td>clang
</td>
</tr>
<tr>
<th>Version</th>
<td>3.3
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>C++
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedclangbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>hadsell@blueskystudios.com
</td>
</tr>
<tr>
<th>CC</th>
<td>dgregor@apple.com, llvmbugs@cs.uiuc.edu
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=11182" name="attach_11182" title="test case described in bug report">attachment 11182</a></span>
test case described in bug report
Our C++ code embeds asm code for a calculation using SIMD instructions. The
attached test case demonstrates two problems.
(1) With G++ I have to use -fno-dse for optimized code; without it the code
runs much more slowly than the equivalent non-SIMD version. With Clang++,
there is no such option, and the SIMD code runs more slowly than the non-SIMD
version.
The performance numbers look like this, running on Fedora 14 with a Xeon(R) CPU
X5680 @ 3.33GHz:
Clang++ SIMD version: 8.3 s.
Clang++ non-SIMD version: 5.8 s.
G++ SIMD version: 8.7 s.
G++ non-SIMD version: 11.2 s.
In the attached test case you can select the SIMD or non-SIMD version with the
#define on line 6.
Our Clang++ 3.3 was built with G++ 4.7.2. I used these options for compiling
and linking: -march=core2 -msse4.1 -m64 -std=c++0x -fPIC -pthread
-Wno-logical-op-parentheses -Wno-shift-op-parentheses -O2 -g
I used G++ 4.5.1 with these options: -march=core2 -msse4.1 -m64 -mpc64
-std=c++0x -pedantic-errors -mieee-fp -fPIC -pthread -O2 -g
-fno-strict-aliasing -fno-tree-ccp -fno-dse
(2) Compiling with -g for a debug (non-optimized) version produces errors like
this for the SIMD version:
clang_test.cc:373:2: error: ran out of registers during register allocation
compute_factors_t (f0, f1, f2, f3, f0_1, f1_1, f2_1, f3_1,
^
clang_test.cc:197:10: note: expanded from macro 'compute_factors_t'
asm ("movapd %1, %%xmm0 \n\t" \
^</pre>
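<pre>A minimal sketch of the pattern behind problem (2), assuming an x86-64 target
and GCC-style extended asm. add_pairs is a hypothetical helper, not code from
the attached test case. It names xmm0 inside the asm string, as the
compute_factors_t macro does, and lists it as a clobber so the compiler keeps
its own values out of that register. Every register reserved this way is one
fewer available to the allocator, which may matter more in a non-optimized
build; whether that is what triggers the error above is not established here.

```cpp
// Hypothetical helper (not from the attached test case): adds two pairs of
// doubles with SSE2, using xmm0 by name inside the asm string.
// Assumes x86-64 and that a, b and out are all 16-byte aligned.
static void add_pairs(const double *a, const double *b, double *out)
{
    asm volatile ("movapd (%0), %%xmm0 \n\t"   // load a[0..1] into xmm0
                  "addpd  (%1), %%xmm0 \n\t"   // add b[0..1] lane by lane
                  "movapd %%xmm0, (%2) \n\t"   // store the sums to out[0..1]
                  :                            // no register outputs
                  : "r" (a), "r" (b), "r" (out)
                  : "xmm0", "memory");         // xmm0 and memory are modified
}
```

The input arrays and the output array must be 16-byte aligned (for example
with alignas(16)), because movapd and addpd with a memory operand fault on
unaligned addresses.
</pre>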
</div>
</p>
</body>
</html>