[llvm-dev] unable to emit vectorized code in LLVM IR

Craig Topper via llvm-dev llvm-dev at lists.llvm.org
Thu Aug 17 12:28:48 PDT 2017


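Your clang command line says -march=knl. Are you running your code on a
Xeon Phi? If not, that's why it's failing: the IR targets KNL's AVX-512
feature set, so the JIT emits instructions your CPU can't execute, hence
the "Illegal instruction". Try changing it to -march=native.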
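
A quick way to confirm the diagnosis is to ask the host CPU what it supports. The following is a minimal sketch, not part of the original exchange, assuming an x86 host and a GCC or Clang compiler that provides the `__builtin_cpu_init` and `__builtin_cpu_supports` builtins. The helper names `host_has_avx512f` and `report_march_choice` are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper (not from the thread): returns 1 if the CPU running
 * this process reports AVX-512 Foundation support, 0 otherwise. */
int host_has_avx512f(void) {
    __builtin_cpu_init();  /* GCC/Clang x86 builtin; initializes CPU feature data */
    return __builtin_cpu_supports("avx512f") != 0;
}

/* Hypothetical helper: prints whether KNL-targeted code has any chance of
 * running on this host. */
void report_march_choice(void) {
    if (host_has_avx512f()) {
        /* avx512f alone is not the full KNL feature set: the IR quoted in
         * this thread also lists avx512cd, avx512er and avx512pf. */
        printf("avx512f present; knl also needs avx512cd/er/pf\n");
    } else {
        printf("avx512f missing; code built with -march=knl cannot run here\n");
    }
}
```

Calling `report_march_choice()` from a small `main` on the machine that runs lli shows whether the -march=knl output could execute there. If avx512f is missing, -march=native is the safer choice.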

~Craig

On Thu, Aug 17, 2017 at 12:21 PM, hameeza ahmed <hahmed2305 at gmail.com>
wrote:

> lli sum-vec03.ll 5 2
> #0 0x0000000000c1f818 (lli+0xc1f818)
> #1 0x0000000000c1d90e (lli+0xc1d90e)
> #2 0x0000000000c1da5c (lli+0xc1da5c)
> #3 0x00007f987c2c3d10 __restore_rt
> (/lib/x86_64-linux-gnu/libpthread.so.0+0x10d10)
> #4 0x00007f987c6f0038
> #5 0x0000000000989f8c (lli+0x989f8c)
> #6 0x00000000009383dc (lli+0x9383dc)
> #7 0x000000000057eedd (lli+0x57eedd)
> #8 0x00007f987b464a40 __libc_start_main
> (/lib/x86_64-linux-gnu/libc.so.6+0x20a40)
> #9 0x00000000005a5b49 (lli+0x5a5b49)
> Stack dump:
> 0. Program arguments: lli sum-vec03.ll 5 2
> Illegal instruction (core dumped)
>
> No, there is not yet any link with the new 2048-element instructions that
> I discussed earlier. I need to link the JIT with those later. For now, I am
> exploring the JIT in general to learn its capabilities.
>
> My main goal is to use the JIT to perform operations, using vector
> instructions, on a user input file supplied at run time. Is this possible
> and achievable through the JIT?
>
> On Fri, Aug 18, 2017 at 12:17 AM, Craig Topper <craig.topper at gmail.com>
> wrote:
>
>> What was your lli command line? Is this based on your code where you
>> created 2048-bit instructions in the x86 backend?
>>
>> ~Craig
>>
>> On Thu, Aug 17, 2017 at 12:12 PM, hameeza ahmed <hahmed2305 at gmail.com>
>> wrote:
>>
>>> OK. I have managed to vectorize the second loop in the following code,
>>> but the first loop is still not vectorized. Why?
>>>
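>>> #include <stdio.h>
>>> #include <stdlib.h>
>>>
>>> int main(int argc, char** argv) {
>>>   int a[1000], b[1000], c[1000]; int g=0;
>>>   int aa=atoi(argv[1]), bb=atoi(argv[2]);
>>>
>>>   /* first loop: initialize a and b from the run-time inputs */
>>>   for (int i=0; i<1000; i++) {
>>>     a[i]=aa+i, b[i]=bb+i;
>>>   }
>>>
>>>   /* second loop: element-wise add plus sum reduction */
>>>   for (int i=0; i<1000; i++) {
>>>     c[i]=a[i] + b[i];
>>>     g+=c[i];
>>>   }
>>>
>>>   printf("sum: %d\n", g);
>>>
>>>   return 0;
>>> }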
>>>
>>> When I execute the optimized IR through the JIT (lli sum-vec03.ll 5 2), I
>>> get the following error:
>>>
>>> #0 0x00000000013f965c llvm::sys::PrintStackTrace(llvm::raw_ostream&)
>>> /lib/Support/Unix/Signals.inc:402:11
>>> #1 0x00000000013f9b49 PrintStackTraceSignalHandler(void*)
>>> /lib/Support/Unix/Signals.inc:466:1
>>> #2 0x00000000013f7ec3 llvm::sys::RunSignalHandlers()
>>> /lib/Support/Signals.cpp:0:5
>>> #3 0x00000000013f9ea4 SignalHandler(int)
>>> /lib/Support/Unix/Signals.inc:256:1
>>> #4 0x00007fcdece96d10 __restore_rt
>>> (/lib/x86_64-linux-gnu/libpthread.so.0+0x10d10)
>>> #5 0x00007fcded2c3038
>>> #6 0x0000000000f4a8fb llvm::MCJIT::runFunction(llvm::Function*,
>>> llvm::ArrayRef<llvm::GenericValue>)
>>> /lib/ExecutionEngine/MCJIT/MCJIT.cpp:538:31
>>> #7 0x0000000000eaff23 llvm::ExecutionEngine::runFunctionAsMain(llvm::Function*,
>>> std::vector<std::__cxx11::basic_string<char, std::char_traits<char>,
>>> std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char,
>>> std::char_traits<char>, std::allocator<char> > > > const&, char const*
>>> const*) /lib/ExecutionEngine/ExecutionEngine.cpp:471:10
>>> #8 0x00000000007be4e9 main /tools/lli/lli.cpp:627:18
>>> #9 0x00007fcdebe2fa40 __libc_start_main
>>> (/lib/x86_64-linux-gnu/libc.so.6+0x20a40)
>>> #10 0x00000000007bc169 _start (/bin/lli+0x7bc169)
>>> Stack dump:
>>> 0. Program arguments: lli sum-vec03.ll 5 2
>>> Illegal instruction (core dumped)
>>>
>>> What is wrong here? Please help.
>>>
>>> On Thu, Aug 17, 2017 at 11:51 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>> wrote:
>>>
>>>> When I change the code to the version below, I get the remark
>>>> "<unknown>:0:0: loop not vectorized: call instruction cannot be vectorized":
>>>> int main(int argc, char** argv) {
>>>> int a[1000], b[1000], c[1000]; int g=0;
>>>> for (int i=0; i<1000; i++) {
>>>> a[i]=atoi(argv[1]), b[i]=atoi(argv[2]);
>>>>  c[i]=a[i] + b[i];
>>>> g+=c[i];
>>>> }
>>>>
>>>> My main goal is to use the JIT to perform operations, using vector
>>>> instructions, on a user input file supplied at run time. Is this
>>>> possible and achievable through the JIT?
>>>> Please help.
>>>>
>>>> On Thu, Aug 17, 2017 at 11:44 PM, Craig Topper <craig.topper at gmail.com>
>>>> wrote:
>>>>
>>>>> I assume the compiler knows that you only have 2 input values that you
>>>>> just added together 1000 times.
>>>>>
>>>>> Despite the fact that you stored to a[i] and b[i] here, nothing reads
>>>>> them other than the addition in the same loop iteration. So the compiler
>>>>> easily removed the a and b arrays. Same with 'c', it's not read outside the
>>>>> loop so it doesn't need to exist. So the compiler turned your loop body
>>>>> back into g+= aa + bb; And since the loop is 1000 iterations and aa and bb
>>>>> never change this got further simplified to (aa+bb)*1000.
>>>>>
>>>>> int main(int argc, char** argv) {
>>>>> int a[1000], b[1000], c[1000]; int g=0;
>>>>> int aa=atoi(argv[1]), bb=atoi(argv[2]);
>>>>> for (int i=0; i<1000; i++) {
>>>>> a[i]=aa, b[i]=bb;
>>>>>  c[i]=a[i] + b[i];
>>>>> g+=c[i];
>>>>> }
>>>>>
>>>>> ~Craig
>>>>>
>>>>> On Thu, Aug 17, 2017 at 11:37 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Why is this happening? Is there any way to solve it?
>>>>>>
>>>>>> On Thu, Aug 17, 2017 at 10:09 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Even if I change my code as follows, vectorized instructions are not
>>>>>>> emitted. What should I do?
>>>>>>>
>>>>>>> int main(int argc, char** argv) {
>>>>>>> int a[1000], b[1000], c[1000]; int g=0;
>>>>>>> int aa=atoi(argv[1]), bb=atoi(argv[2]);
>>>>>>> for (int i=0; i<1000; i++) {
>>>>>>> a[i]=aa, b[i]=bb;
>>>>>>>  c[i]=a[i] + b[i];
>>>>>>> g+=c[i];
>>>>>>> }
>>>>>>>
>>>>>>> printf("sum: %d\n", g);
>>>>>>>
>>>>>>> return 0;
>>>>>>> }
>>>>>>>
>>>>>>> On Thu, Aug 17, 2017 at 10:03 PM, Craig Topper <
>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>
>>>>>>>> Did you remove the printf completely? Meaning that nothing accesses
>>>>>>>> 'c' after the loop? If so it got removed as dead code because it had no
>>>>>>>> visible effect.
>>>>>>>>
>>>>>>>> ~Craig
>>>>>>>>
>>>>>>>> On Thu, Aug 17, 2017 at 10:01 AM, hameeza ahmed <
>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I removed the printf from the loop. Now I get no error, but the IR
>>>>>>>>> does not contain vectorized code. The IR output is as follows:
>>>>>>>>> ; ModuleID = 'sum-vec.ll'
>>>>>>>>> source_filename = "sum-vec.c"
>>>>>>>>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>>>>>>>>> target triple = "x86_64-unknown-linux-gnu"
>>>>>>>>>
>>>>>>>>> ; Function Attrs: norecurse nounwind readnone uwtable
>>>>>>>>> define i32 @main(i32, i8** nocapture readnone) local_unnamed_addr #0 {
>>>>>>>>>   ret i32 0
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> attributes #0 = { norecurse nounwind readnone uwtable
>>>>>>>>> "correctly-rounded-divide-sqrt-fp-math"="false"
>>>>>>>>> "disable-tail-calls"="false" "less-precise-fpmad"="false"
>>>>>>>>> "no-frame-pointer-elim"="false" "no-infs-fp-math"="false"
>>>>>>>>> "no-jump-tables"="false" "no-nans-fp-math"="false"
>>>>>>>>> "no-signed-zeros-fp-math"="false" "no-trapping-math"="false"
>>>>>>>>> "stack-protector-buffer-size"="8" "target-cpu"="knl"
>>>>>>>>> "target-features"="+adx,+aes,+avx,+avx2,+avx512cd,+avx512er,
>>>>>>>>> +avx512f,+avx512pf,+bmi,+bmi2,+cx16,+f16c,+fma,+fsgsbase,+fxsr,
>>>>>>>>> +lzcnt,+mmx,+movbe,+pclmul,+popcnt,+prefetchwt1,+rdrnd,+rdseed,
>>>>>>>>> +rtm,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt"
>>>>>>>>> "unsafe-fp-math"="false" "use-soft-float"="false" }
>>>>>>>>>
>>>>>>>>> !llvm.ident = !{!0}
>>>>>>>>>
>>>>>>>>> !0 = !{!"clang version 4.0.0 (tags/RELEASE_400/final)"}
>>>>>>>>>
>>>>>>>>> What should I do? Please help.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Aug 17, 2017 at 9:57 PM, Nemanja Ivanovic <
>>>>>>>>> nemanja.i.ibm at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Move the printf out of the loop and it should vectorize just fine.
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 17, 2017 at 6:52 PM, hameeza ahmed <
>>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I want to vectorize operations on user-supplied inputs. Once opt
>>>>>>>>>>> vectorizes the code, the user-supplied inputs (from a text file)
>>>>>>>>>>> should be added using AVX vector instructions.
>>>>>>>>>>>
>>>>>>>>>>> As you pointed out, I changed my code to the following:
>>>>>>>>>>>
>>>>>>>>>>> int main(int argc, char** argv) {
>>>>>>>>>>> int a[1000], b[1000], c[1000];
>>>>>>>>>>> int aa=atoi(argv[1]), bb=atoi(argv[2]);
>>>>>>>>>>> for (int i=0; i<1000; i++) {
>>>>>>>>>>> a[i]=aa, b[i]=bb;
>>>>>>>>>>>  c[i]=a[i] + b[i];
>>>>>>>>>>> printf("sum: %d\n", c[i]);
>>>>>>>>>>>
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> I am getting the remark: <unknown>:0:0: loop not vectorized:
>>>>>>>>>>> call instruction cannot be vectorized.
>>>>>>>>>>>
>>>>>>>>>>> I am running following commands:
>>>>>>>>>>> clang  -S -emit-llvm sum-vec.c -march=knl -O3 -mllvm
>>>>>>>>>>> -disable-llvm-optzns -o sum-vec.ll
>>>>>>>>>>> opt  -S -O3 -force-vector-width=64 sum-vec.ll -o sum-vec03.ll
>>>>>>>>>>>
>>>>>>>>>>> How can I achieve this? Please help.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Aug 17, 2017 at 10:44 AM, Nemanja Ivanovic <
>>>>>>>>>>> nemanja.i.ibm at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm not sure what you expect to have vectorized here. If you
>>>>>>>>>>>> look at the emitted code, there's no loop. It's just an add and a multiply
>>>>>>>>>>>> as you might expect when adding a loop-invariant sum 1000 times in a loop.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Aug 16, 2017 at 11:38 PM, hameeza ahmed via llvm-dev <
>>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>> I have written the following code. When I try to vectorize it
>>>>>>>>>>>>> through opt, I do not get vectorized instructions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> #include <stdio.h>
>>>>>>>>>>>>> #include<stdlib.h>
>>>>>>>>>>>>> int main(int argc, char** argv) {
>>>>>>>>>>>>> int sum=0; int a=atoi(argv[1]); int b=atoi(argv[2]);
>>>>>>>>>>>>> for (int i=0;i<1000;i++)
>>>>>>>>>>>>> {
>>>>>>>>>>>>> sum+=a+b;
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> printf("sum: %d\n", sum);
>>>>>>>>>>>>> return 0;
>>>>>>>>>>>>> }
>>>>>>>>>>>>> I use the following commands:
>>>>>>>>>>>>> clang  -S -emit-llvm sum-main.c -march=knl -O3 -mllvm
>>>>>>>>>>>>> -disable-llvm-optzns -o sum-main.ll
>>>>>>>>>>>>> opt  -S -O3 -force-vector-width=64 sum-main.ll -o sum-main03.ll
>>>>>>>>>>>>>
>>>>>>>>>>>>> Why is that so? Where am I making a mistake? I am getting
>>>>>>>>>>>>> scalar operations rather than vectorized ones.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please help.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank You
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>>>
>>>>>>>>>>>>>