<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p><br>
</p>
<br>
<div class="moz-cite-prefix">On 06/20/2017 07:21 PM, hameeza ahmed
via llvm-dev wrote:<br>
</div>
<blockquote
cite="mid:CAFMPKeZbyciXPKA6SuU09r9CXTX4J0twdjSESCKTT06HUmPoxA@mail.gmail.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<div dir="ltr">Hello,
<div><br>
</div>
<div>I am using llvm on my core i7 laptop which has no avx
support.</div>
<div><br>
</div>
<div>my goal is to generate avx512 code (loop vectorization) for
Knights Landing/Skylake.</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>my .c code is;</div>
<div><br>
</div>
<div>
<div>int a[256], b[256], c[256];</div>
<div>foo () {</div>
</div>
</div>
</blockquote>
<br>
void foo() {<br>
<br>
<blockquote
cite="mid:CAFMPKeZbyciXPKA6SuU09r9CXTX4J0twdjSESCKTT06HUmPoxA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<div>int i;</div>
<div>for (i=0; i&lt;256; i++) {</div>
<div>a[i] = b[i] + c[i];</div>
<div>}</div>
<div>}</div>
</div>
<div><br>
</div>
<div>i first generated its .ll file via clang </div>
<div><br>
</div>
<div>clang -S -emit-llvm test.c -o test.ll<br>
</div>
</div>
</blockquote>
<br>
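Your problem is that vectorization happens in opt, not in llc.
Without target information, opt vectorizes for a generic x86-64
target with 128-bit (xmm) vectors, so telling llc afterwards that you
wish to enable AVX-512 is not sufficient. In fact, if you run:<br>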
<br>
clang -S -o - test.c -march=knl -O3<br>
<br>
you'll see AVX-512 vectorized code. If you want to run opt
separately to generate the vectorized code, you need to tell it that
it is targeting the KNL. Clang can add the necessary function
attributes to do this. You'll also want to run clang with
optimizations enabled so that it will generate IR that is intended
to be optimized, even if you then disable the actual optimizations
to get the pre-opt IR.<br>
<br>
clang -S -emit-llvm test.c -march=knl -O3 -mllvm
-disable-llvm-optzns<br>
<br>
Then, running opt as you have it below should produce the desired
result.<br>
<br>
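For reference, here is a minimal sketch of the complete test.c with
the explicit return type, and with the commands from this thread
collected in the leading comment. The check_foo helper is hypothetical
and not part of your original file; it is only there so that a
vectorized build can be sanity-checked, and the toolchain version
named in the comment is an assumption based on this thread.<br>

```c
/* test.c: build steps as discussed in this thread
 * (assumes a Clang/LLVM 3.9 or 4.0 era toolchain):
 *   clang -S -emit-llvm test.c -march=knl -O3 -mllvm -disable-llvm-optzns
 *   opt -S -O3 test.ll -o test_o3.ll
 *   llc -mcpu=knl -mattr=+avx512f test_o3.ll -o test_o3.s
 */
#include <assert.h>

/* Same globals as in the original test.c. */
int a[256], b[256], c[256];

/* Explicit return type: "foo () {" relies on implicit int. */
void foo(void) {
    int i;
    for (i = 0; i < 256; i++) {
        a[i] = b[i] + c[i];
    }
}

/* Hypothetical helper, not in the original file: fills the inputs,
 * runs foo(), and asserts that every element holds the expected sum. */
void check_foo(void) {
    int i;
    for (i = 0; i < 256; i++) {
        b[i] = i;
        c[i] = 2 * i;
    }
    foo();
    for (i = 0; i < 256; i++) {
        assert(a[i] == 3 * i);
    }
}
```

If you only want to inspect the assembly for foo, check_foo can be
dropped; the file then matches your original apart from the return
type.<br>
<br>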
Finally, I recommend upgrading to Clang/LLVM 4.0. It produces better
AVX-512 code than 3.9 did.<br>
<br>
-Hal<br>
<br>
<blockquote
cite="mid:CAFMPKeZbyciXPKA6SuU09r9CXTX4J0twdjSESCKTT06HUmPoxA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><br>
</div>
<div>then i optimized it;</div>
<div><br>
</div>
<div>opt -S -O3 test.ll -o test_o3.ll<br>
</div>
<div><br>
</div>
<div>then i used llc for code generation</div>
<div><br>
</div>
<div>llc -mcpu=skylake-avx512 -mattr=+avx512f test_o3.ll -o
test_o3.s<br>
</div>
<div><br>
</div>
<div>llc -mcpu=knl -mattr=+avx512f test_o3.ll -o test_o3.s<br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>here is my generated code;</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div><span style="white-space:pre"> </span>.text</div>
<div><span style="white-space:pre"> </span>.file<span style="white-space:pre"> </span>"filer_o3.ll"</div>
<div><span style="white-space:pre"> </span>.globl<span style="white-space:pre"> </span>foo</div>
<div><span style="white-space:pre"> </span>.p2align<span style="white-space:pre"> </span>4,
0x90</div>
<div><span style="white-space:pre"> </span>.type<span style="white-space:pre"> </span>foo,@function</div>
<div>foo: # @foo</div>
<div><span style="white-space:pre"> </span>.cfi_startproc</div>
<div># BB#0: #
%min.iters.checked</div>
<div><span style="white-space:pre"> </span>pushq<span style="white-space:pre"> </span>%rbp</div>
<div>.Ltmp0:</div>
<div><span style="white-space:pre"> </span>.cfi_def_cfa_offset
16</div>
<div>.Ltmp1:</div>
<div><span style="white-space:pre"> </span>.cfi_offset %rbp,
-16</div>
<div><span style="white-space:pre"> </span>movq<span style="white-space:pre"> </span>%rsp,
%rbp</div>
<div>.Ltmp2:</div>
<div><span style="white-space:pre"> </span>.cfi_def_cfa_register
%rbp</div>
<div><span style="white-space:pre"> </span>movq<span style="white-space:pre"> </span>$-1024,
%rax # imm = 0xFC00</div>
<div><span style="white-space:pre"> </span>.p2align<span style="white-space:pre"> </span>4,
0x90</div>
<div>.<b><font color="#0000ff">LBB0_1:
# %vector.body</font></b></div>
<div><b><font color="#0000ff">
# =>This Inner Loop Header: Depth=1</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vmovdqa32<span style="white-space:pre"> </span>c+1024(%rax),
%xmm0</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vmovdqa32<span style="white-space:pre"> </span>c+1040(%rax),
%xmm1</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vpaddd<span style="white-space:pre"> </span>b+1024(%rax),
%xmm0, %xmm0</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vpaddd<span style="white-space:pre"> </span>b+1040(%rax),
%xmm1, %xmm1</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vmovdqa32<span style="white-space:pre"> </span>%xmm0,
a+1024(%rax)</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vmovdqa32<span style="white-space:pre"> </span>%xmm1,
a+1040(%rax)</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vmovdqa32<span style="white-space:pre"> </span>c+1056(%rax),
%xmm0</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vmovdqa32<span style="white-space:pre"> </span>c+1072(%rax),
%xmm1</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vpaddd<span style="white-space:pre"> </span>b+1056(%rax),
%xmm0, %xmm0</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vpaddd<span style="white-space:pre"> </span>b+1072(%rax),
%xmm1, %xmm1</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vmovdqa32<span style="white-space:pre"> </span>%xmm0,
a+1056(%rax)</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>vmovdqa32<span style="white-space:pre"> </span>%xmm1,
a+1072(%rax)</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>addq<span style="white-space:pre"> </span>$64,
%rax</font></b></div>
<div><b><font color="#0000ff"><span style="white-space:pre"> </span>jne<span style="white-space:pre"> </span>.LBB0_1</font></b></div>
<div># BB#2: # %middle.block</div>
<div><span style="white-space:pre"> </span>popq<span style="white-space:pre"> </span>%rbp</div>
<div><span style="white-space:pre"> </span>retq</div>
<div>.Lfunc_end0:</div>
<div><span style="white-space:pre"> </span>.size<span style="white-space:pre"> </span>foo,
.Lfunc_end0-foo</div>
<div><span style="white-space:pre"> </span>.cfi_endproc</div>
<div><br>
</div>
<div><span style="white-space:pre"> </span>.type<span style="white-space:pre"> </span>b,@object
# @b</div>
<div><span style="white-space:pre"> </span>.comm<span style="white-space:pre"> </span>b,1024,16</div>
<div><span style="white-space:pre"> </span>.type<span style="white-space:pre"> </span>c,@object
# @c</div>
<div><span style="white-space:pre"> </span>.comm<span style="white-space:pre"> </span>c,1024,16</div>
<div><span style="white-space:pre"> </span>.type<span style="white-space:pre"> </span>a,@object
# @a</div>
<div><span style="white-space:pre"> </span>.comm<span style="white-space:pre"> </span>a,1024,16</div>
<div><br>
</div>
<div><span style="white-space:pre"> </span>.ident<span style="white-space:pre"> </span>"clang
version 3.9.0 (tags/RELEASE_390/final)"</div>
<div><span style="white-space:pre"> </span>.section<span style="white-space:pre"> </span>".note.GNU-stack","",@progbits</div>
</div>
<div><br>
</div>
<div>In the generated code there are vmov... instructions, but no
zmm registers, only xmm registers.</div>
<div><br>
</div>
<div><br>
</div>
<div>Can you please point out where I am going wrong? I have tried
several times with different parameters but always get xmm
registers.</div>
<div><br>
</div>
<div><br>
</div>
<div>Thank You</div>
</div>
<br>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</body>
</html>