<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><span class="vcard"><a class="email" href="mailto:kfjahnke@gmail.com" title="Kay F. Jahnke <kfjahnke@gmail.com>"> <span class="fn">Kay F. Jahnke</span></a>
</span> changed
<a class="bz_bug_link
bz_status_REOPENED "
title="REOPENED - autovectorization of repeated calls to vectorizable functions fails"
href="https://bugs.llvm.org/show_bug.cgi?id=40265">bug 40265</a>
<br>
<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>What</th>
<th>Removed</th>
<th>Added</th>
</tr>
<tr>
<td style="text-align:right;">Resolution</td>
<td>FIXED
</td>
<td>---
</td>
</tr>
<tr>
<td style="text-align:right;">Status</td>
<td>RESOLVED
</td>
<td>REOPENED
</td>
</tr></table>
<p>
<div>
<b><a class="bz_bug_link
bz_status_REOPENED "
title="REOPENED - autovectorization of repeated calls to vectorizable functions fails"
href="https://bugs.llvm.org/show_bug.cgi?id=40265#c8">Comment # 8</a>
on <a class="bz_bug_link
bz_status_REOPENED "
title="REOPENED - autovectorization of repeated calls to vectorizable functions fails"
href="https://bugs.llvm.org/show_bug.cgi?id=40265">bug 40265</a>
from <span class="vcard"><a class="email" href="mailto:kfjahnke@gmail.com" title="Kay F. Jahnke <kfjahnke@gmail.com>"> <span class="fn">Kay F. Jahnke</span></a>
</span></b>
<pre>I'd like to raise the topic again. Using the compiler flags which resulted in
autovectorization of my example loop running sqrt, I tried a few other standard
functions, notably the trigonometric ones, log and exp. Contrary to the
documentation, I could see no indication of these functions being
autovectorized on my AVX2 machine. Instead, the maths is done by issuing a callq
to the respective standard function (e.g. callq sinf). I assume that these
functions are not autovectorized for my system because there are no
corresponding assembler instructions - the equivalent of what vsqrtps and
vsqrtpd are in the sqrt case - and vectorizing these functions for the Intel ISA
requires additional code. So I'd like to find out whether I am again missing
something or whether autovectorization of these functions simply does not
happen for SSE/AVX/AVX2, in which case I'd propose adding another hint to
the documentation. Here's the example code I used, calling sin repeatedly:
#include <cmath>

extern float data [ 32768 ] ;

extern void vf1()
{
#pragma clang loop vectorize(enable)
  for ( int i = 0 ; i < 32768 ; i++ )
    data [ i ] = std::sin ( data [ i ] ) ;
}
I compiled like this:
$ clang++ -fvectorize -Rpass=loop-vectorize -Rpass-analysis=loop-vectorize \
    -fno-math-errno -std=c++11 -O3 -mavx2 -S -o xx.s xx.cc
xx.cc:8:3: remark: the cost-model indicates that interleaving is not beneficial [-Rpass-analysis=loop-vectorize]
  for ( int i = 0 ; i < 32768 ; i++ )
  ^
xx.cc:8:3: remark: vectorized loop (vectorization width: 8, interleaved count: 1) [-Rpass=loop-vectorize]
using this clang++ version:
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
Target: x86_64-pc-linux-gnu
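
One way to check for the scalar calls described above is to scan the generated
assembly for call instructions that target the libm symbol. The following is
only a minimal sketch: count_calls is a hypothetical helper (not part of clang
or LLVM), and it assumes AT&T-syntax output such as xx.s with one instruction
per line.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: count lines of assembly text that call `symbol`,
// e.g. "callq sinf" or "call sinf@PLT". Assumes one instruction per line.
std::size_t count_calls(const std::string& asm_text, const std::string& symbol)
{
  std::istringstream in(asm_text);
  std::string line;
  std::size_t hits = 0;
  while (std::getline(in, line)) {
    // Split the line into its first two whitespace-separated fields.
    std::istringstream fields(line);
    std::string mnemonic, operand;
    if (!(fields >> mnemonic >> operand))
      continue;
    if (mnemonic != "call" && mnemonic != "callq")
      continue;
    // Strip a relocation suffix such as "@PLT" before comparing.
    operand = operand.substr(0, operand.find('@'));
    if (operand == symbol)
      ++hits;
  }
  return hits;
}
```

Reading xx.s into a string and calling count_calls(text, "sinf") would then
give a non-zero count if scalar calls to sinf remain in the output.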
Kay</pre>
</div>
</p>
</body>
</html>