<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>I am forwarding this to cfe-dev, as this sounds like it might be a
bug and cfe-users is not read that much.<br>
</p>
<div class="moz-forward-container"><br>
<br>
-------- Forwarded Message --------
<table class="moz-email-headers-table" cellspacing="0"
cellpadding="0" border="0">
<tbody>
<tr>
<th valign="BASELINE" nowrap="nowrap" align="RIGHT">Subject:
</th>
<td>[cfe-users] floor is vectorized, but not sin, cos or exp</td>
</tr>
<tr>
<th valign="BASELINE" nowrap="nowrap" align="RIGHT">Date: </th>
<td>Tue, 11 Dec 2018 16:47:16 +0100</td>
</tr>
<tr>
<th valign="BASELINE" nowrap="nowrap" align="RIGHT">From: </th>
<td>Klaus Leppkes via cfe-users
<a class="moz-txt-link-rfc2396E" href="mailto:cfe-users@lists.llvm.org">&lt;cfe-users@lists.llvm.org&gt;</a></td>
</tr>
<tr>
<th valign="BASELINE" nowrap="nowrap" align="RIGHT">Reply-To: </th>
<td>Klaus Leppkes <a class="moz-txt-link-rfc2396E" href="mailto:klaus.leppkes@rwth-aachen.de">&lt;klaus.leppkes@rwth-aachen.de&gt;</a></td>
</tr>
<tr>
<th valign="BASELINE" nowrap="nowrap" align="RIGHT">To: </th>
<td><a class="moz-txt-link-abbreviated" href="mailto:cfe-users@lists.llvm.org">cfe-users@lists.llvm.org</a></td>
</tr>
</tbody>
</table>
<br>
<br>
<p>Hi,</p>
<p>According to the docs (<a class="moz-txt-link-freetext"
href="https://releases.llvm.org/7.0.0/docs/Vectorizers.html"
moz-do-not-send="true">https://releases.llvm.org/7.0.0/docs/Vectorizers.html</a>),
floor, sin, and cos should be vectorized.</p>
<p>I can confirm (using the great <a
class="moz-txt-link-freetext" href="https://gcc.godbolt.org/"
moz-do-not-send="true">https://gcc.godbolt.org/</a> tool) that
using the flags "-Ofast -mavx2 -fopenmp -ffast-math" the expected
AVX instruction (<span style="color: #0000ff;">vroundps</span>) is
emitted for floor (in foo), but unfortunately not for sin, cos or
exp (e.g. see sin in bar below). <br>
</p>
<p>GCC 8.1+ and the Intel compiler icc 13+ insert calls to
vectorized implementations (<span style="color: #008080;">_ZGVbN4v_sinf</span>
or <span style="color: #008080;">__svml_sinf4</span>, respectively), but
clang seems to have nothing like this.<br>
</p>
<div style="color: #000000;background-color:

#fffffe;font-family:
 Consolas, ">
<div><span style="color: #008080;"><font color="#000000">Here is
my small test code:</font><br>
</span></div>
<div><span style="color: #008080;"><br>
</span></div>
</div>
<div style="color: #000000;background-color:

#fffffe;font-family:
 Consolas, ">
<div><span style="color: #0000ff;">#include</span><span
style="color: #000000;"> &lt;cmath&gt;</span></div>
<br>
<div><span style="color: #0000ff;">void</span><span
style="color:
 #000000;"> foo(</span><span
style="color:
 #0000ff;">float</span><span style="color:
#000000;"> * </span><span style="color:
 #0000ff;">__restrict</span><span
style="color: #000000;"> __attribute((aligned(</span><span
style="color: #09885a;">32</span><span style="color:
#000000;">))) x</span></div>
<div><span style="color: #000000;">, </span><span
style="color:
 #0000ff;">float</span><span
style="color:
 #000000;"> * </span><span style="color:
#0000ff;">__restrict</span><span style="color:

#000000;"> __attribute((aligned(</span><span
style="color:
 #09885a;">32</span><span
style="color:
 #000000;">))) y) {</span></div>
<div><span style="color: #000000;"> </span><span
style="color:
 #0000ff;">for</span><span
style="color:
 #000000;"> (</span><span style="color:
#0000ff;">int</span><span style="color:
 #000000;"> i =
</span><span style="color:
 #09885a;">0</span><span
style="color: #000000;">; i &lt; </span><span
style="color:
 #09885a;">4</span><span
style="color:
 #000000;">; ++i)</span></div>
<div><span style="color: #000000;"> y[i] = floor(x[i]);</span></div>
<div><span style="color: #000000;">}</span></div>
<br>
<br>
<div><span style="color: #0000ff;">void</span><span
style="color:
 #000000;"> bar(</span><span
style="color:
 #0000ff;">float</span><span style="color:
#000000;"> * </span><span style="color:
 #0000ff;">__restrict</span><span
style="color: #000000;"> __attribute((aligned(</span><span
style="color: #09885a;">32</span><span style="color:
#000000;">))) x</span></div>
<div><span style="color: #000000;">, </span><span
style="color:
 #0000ff;">float</span><span
style="color:
 #000000;"> * </span><span style="color:
#0000ff;">__restrict</span><span style="color:

#000000;"> __attribute((aligned(</span><span
style="color:
 #09885a;">32</span><span
style="color:
 #000000;">))) y) {</span></div>
<div><span style="color: #000000;"> </span><span
style="color:
 #0000ff;">for</span><span
style="color:
 #000000;"> (</span><span style="color:
#0000ff;">int</span><span style="color:
 #000000;"> i =
</span><span style="color:
 #09885a;">0</span><span
style="color: #000000;">; i &lt; </span><span
style="color:
 #09885a;">4</span><span
style="color:
 #000000;">; ++i)</span></div>
<div><span style="color: #000000;"> y[i] = sin(x[i]);</span></div>
<div><span style="color: #000000;">}</span></div>
</div>
<div style="color: #000000;background-color:

#fffffe;font-family:
 Consolas, "><br>
</div>
<div style="color: #000000;background-color:

#fffffe;font-family:
 Consolas, ">I have reproduced this
behavior on different machines. Maybe I am doing something wrong
here, but it seems like there is no vectorized implementation
for sin, cos etc. I am using h2lib for now (<a
class="moz-txt-link-freetext"
href="http://h2lib.org/doc/d1/d89/simd__avx_8h_source.html"
moz-do-not-send="true">http://h2lib.org/doc/d1/d89/simd__avx_8h_source.html</a>)
as a workaround, but I would expect clang to do this job itself.</div>
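<p>To illustrate what such a manual workaround amounts to, here is a
minimal sketch. It is not h2lib's code and not anything clang provides:
sin_poly and bar_poly are hypothetical names, and the polynomial is a
plain Taylor expansion without range reduction, so it is only usable
for inputs already in about [-pi, pi] (error within roughly 1e-3
there). The point is only that a branch-free arithmetic body gives the
loop vectorizer something it can handle without a vector math library.</p>
<pre>
```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper (assumption for illustration, not h2lib or clang code):
// approximates sin(x) for x in about [-pi, pi] with the odd Taylor polynomial
// up to x^11, evaluated with Horner's scheme. No range reduction is done.
static inline float sin_poly(float x) {
  const float x2 = x * x;
  return x * (1.0f + x2 * (-1.0f / 6.0f
           + x2 * (1.0f / 120.0f
           + x2 * (-1.0f / 5040.0f
           + x2 * (1.0f / 362880.0f
           + x2 * (-1.0f / 39916800.0f))))));
}

// Same loop shape as bar() in the test code, but the library call is replaced
// by plain arithmetic, so no call has to be mapped to a vector variant.
void bar_poly(const float *__restrict x, float *__restrict y, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    y[i] = sin_poly(x[i]);
}
```
</pre>
<p>Calling bar_poly(x, y, 4) on inputs in that range can be compared
against std::sin element by element to check whether the accuracy is
good enough for a given use.</p>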
<div style="color: #000000;background-color:

#fffffe;font-family:
 Consolas, "><br>
</div>
<div style="color: #000000;background-color:

#fffffe;font-family:
 Consolas, ">Can anybody comment on
this please?<br>
</div>
<div style="color: #000000;background-color:

#fffffe;font-family:
 Consolas, "><br>
</div>
Cheers<br>
Klaus </div>
</body>
</html>