Thank you.<div><br></div><div>I am working on a machine with wider vector registers. How can I make the loop vectorizer emit wider, or otherwise different, vector widths in loop code, both through a pragma and automatically?<br><br>On Friday, September 21, 2018, Friedman, Eli <<a href="mailto:efriedma@codeaurora.org">efriedma@codeaurora.org</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div>On 9/20/2018 2:15 PM, hameeza ahmed
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div>Hello,</div>
<div>I'm trying to set the vector width using #pragma clang
loop vectorize_width(32), but I'm getting width 8 for the
following kernel:</div>
<br>
<div><b>I get the following output when I compile:</b></div>
<div><b><br>
</b></div>
<div><b>clang -O3 correlation.c -Rpass=loop-vectorize
-emit-llvm -march=knl -S -o 1.ll<br>
correlation.c:38:9: remark: vectorized loop
(vectorization width: 8, interleaved count: 4)
[-Rpass=loop-vectorize]<br>
for (j = 0; j < M; j++)<br>
^<br>
</b></div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
With AVX-512, an instruction can operate on at most 8
double-precision lanes. The vectorizer recognizes that, and
interleaves the loop so you get 8*4==32 scalar iterations per
iteration of the vectorized loop.<br>
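<br>
The arithmetic above can be sketched in C as follows. This is only an illustration, not the kernel from correlation.c: the function names (lanes_per_register, scalar_iterations_per_vector_iteration, scale_doubles) are hypothetical, and the loop is a simple element-wise scale chosen so the result does not depend on evaluation order. The vectorize_width and interleave_count options are loop hints; whether the vectorizer follows them depends on the target and compiler version, so the -Rpass=loop-vectorize remark is the thing to check.<br>

```c
#include <stddef.h>

/* Hypothetical helper: how many elements of a given size fit in one
 * vector register. For AVX-512 doubles: 512 / 64 == 8 lanes. */
unsigned lanes_per_register(unsigned register_bits, unsigned element_bits)
{
    return register_bits / element_bits;
}

/* Hypothetical helper: scalar iterations covered by one iteration of the
 * vectorized loop, i.e. vector width times interleave count (8 * 4 == 32). */
unsigned scalar_iterations_per_vector_iteration(unsigned width,
                                                unsigned interleave)
{
    return width * interleave;
}

/* Element-wise scale. The pragma asks for the same shape the remark
 * reported (width 8, interleave 4); it is a hint, and compilers that do
 * not know it simply ignore it, so the function behaves the same either way. */
void scale_doubles(double *restrict out, const double *restrict in,
                   size_t n, double factor)
{
#pragma clang loop vectorize_width(8) interleave_count(4)
    for (size_t j = 0; j < n; j++)
        out[j] = in[j] * factor;
}
```

Assuming the code is saved as scale.c (a made-up file name), compiling it the same way as in the original message, for example clang -O3 -march=knl -Rpass=loop-vectorize -c scale.c, should print a remark giving the width and interleave count that were actually chosen.<br>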
<br>
-Eli<br>
<pre cols="72">--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
</div>
</blockquote></div>