[llvm-dev] Vectorization width not correct using #pragma clang loop vectorize_width
Friedman, Eli via llvm-dev
llvm-dev at lists.llvm.org
Thu Sep 20 15:29:56 PDT 2018
On 9/20/2018 2:15 PM, hameeza ahmed wrote:
> Hello,
> I'm trying to set the vector width using #pragma clang loop
> vectorize_width(32), but I'm getting width 8 for the following kernel.
>
> I'm getting the following output when I compile:
>
> clang -O3 correlation.c -Rpass=loop-vectorize -emit-llvm
> -march=knl -S -o 1.ll
> correlation.c:38:9: remark: vectorized loop (vectorization width: 8,
> interleaved count: 4) [-Rpass=loop-vectorize]
> for (j = 0; j < M; j++)
> ^
>
With AVX-512, a vector register is 512 bits wide, so a single instruction
can operate on at most 512/64 == 8 double-precision lanes. The vectorizer
recognizes that, and interleaves the loop by 4 so you still get
8*4==32 scalar iterations per iteration of the vectorized loop.
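
To make the arithmetic concrete, here is a minimal sketch in C. It is not the
kernel from correlation.c: the function names (lanes_per_vector,
scalars_per_trip, scale) are made up for illustration, and the loop is a
simple elementwise multiply. It spells out the width and interleave count
from the remark explicitly with the documented vectorize_width and
interleave_count loop hints.

```c
#include <assert.h>
#include <stddef.h>

/* Lanes one vector instruction can hold: register width / element width.
 * For AVX-512 and double this is 512 / 64 = 8. */
static size_t lanes_per_vector(size_t vector_bits, size_t element_bits)
{
    assert(element_bits != 0 && vector_bits % element_bits == 0);
    return vector_bits / element_bits;
}

/* Scalar iterations covered by one trip of the vectorized loop:
 * vectorization width times interleave count. */
static size_t scalars_per_trip(size_t width, size_t interleave)
{
    return width * interleave;
}

/* Hypothetical kernel (an assumption, not the loop from correlation.c).
 * The hints ask for 8 lanes per vector and 4 interleaved copies,
 * i.e. the combination reported in the remark above. */
static void scale(double *out, const double *in, double factor, size_t n)
{
#if defined(__clang__)
#pragma clang loop vectorize_width(8) interleave_count(4)
#endif
    for (size_t j = 0; j < n; j++)
        out[j] = in[j] * factor;
}
```

Compiling a file containing a loop like this with the same flags as in the
question (-O3 -march=knl -Rpass=loop-vectorize) should let you check which
width and interleave count the vectorizer actually reports for it.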
-Eli
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project