<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - ASM-based SIMD code -O2 performance and -g compilation errors"
   href="http://llvm.org/bugs/show_bug.cgi?id=17195">17195</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>ASM-based SIMD code -O2 performance and -g compilation errors
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>3.3
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>C++
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>hadsell@blueskystudios.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>dgregor@apple.com, llvmbugs@cs.uiuc.edu
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=11182" name="attach_11182" title="test case described in bug report">attachment 11182</a></span>
test case described in bug report

Our C++ code embeds asm code for a calculation using SIMD instructions.  The
attached test case demonstrates two problems.

(1) With G++ I have to use -fno-dse for optimized code; without it the SIMD
code runs much more slowly than the equivalent non-SIMD version.  Clang++ has
no such option, and its SIMD code runs more slowly than its non-SIMD version.

The performance numbers look like this, running on Fedora 14 with a Xeon(R) CPU
X5680 @ 3.33GHz:

Clang++ SIMD version: 8.3 s.
Clang++ non-SIMD version: 5.8 s.
G++ SIMD version: 8.7 s.
G++ non-SIMD version: 11.2 s.

In the attached test case you can select the SIMD or non-SIMD version with a
#define on line 6.

Our Clang++ 3.3 was built with G++ 4.7.2.  I used these options for compiling
and linking: -march=core2 -msse4.1 -m64 -std=c++0x -fPIC -pthread
-Wno-logical-op-parentheses -Wno-shift-op-parentheses -O2 -g

I used G++ 4.5.1 with these options: -march=core2 -msse4.1 -m64 -mpc64
-std=c++0x -pedantic-errors -mieee-fp -fPIC -pthread -O2 -g
-fno-strict-aliasing -fno-tree-ccp -fno-dse

(2) Compiling the SIMD version with -g for a debug (non-optimized) build
produces errors like this:

clang_test.cc:373:2: error: ran out of registers during register allocation
        compute_factors_t (f0, f1, f2, f3, f0_1, f1_1, f2_1, f3_1,
        ^
clang_test.cc:197:10: note: expanded from macro 'compute_factors_t'
    asm ("movapd %1, %%xmm0                     \n\t"    \
         ^</pre>
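        <pre>For comparison with problem (2), here is a minimal sketch of packed-double
arithmetic written without inline asm, using the GCC/Clang vector_size
extension.  It is not the code from the attachment: mul_add and mul_add2 are
hypothetical names, and the arithmetic is only a stand-in for whatever
compute_factors_t does.  Whether this form avoids the register-allocation
error or matches the asm version's speed on the real test case is untested.

```cpp
// Two packed doubles in one 16-byte vector (GCC/Clang extension).
typedef double v2d __attribute__((vector_size(16)));

// Hypothetical stand-in for one step of the SIMD calculation.
// No xmm register is named here, so the compiler's register allocator
// picks the registers itself at every optimization level.
static inline v2d mul_add(v2d a, v2d b, v2d c) {
    return a * b + c;  // element-wise multiply, then element-wise add
}

// Load two doubles from each input, apply mul_add, store two results.
void mul_add2(const double* a, const double* b, const double* c, double* out) {
    v2d va = {a[0], a[1]};
    v2d vb = {b[0], b[1]};
    v2d vc = {c[0], c[1]};
    v2d r = mul_add(va, vb, vc);
    out[0] = r[0];
    out[1] = r[1];
}
```

The asm in the test case names %%xmm0 directly, so it presumably reserves
fixed registers in addition to the registers needed for its operands; the
sketch above leaves all of that to the compiler.</pre>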
        </div>
      </p>
    </body>
</html>