[llvm-dev] SLP example not being vectorized

Fri Nov 29 07:31:44 PST 2019

Thanks Adrien, I did not realize that integer division has no vector 
instruction. IMO, we should provide a better code snippet in this 
section: a 'simpler' arithmetic expression that would be SLP-vectorized 
on most architectures, together with the -march parameter to use. I will 
propose it and see the comments.
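
As a rough sketch of what such a snippet might look like (the name 
foo_add and the file name slp_add.c are made up for illustration, and 
whether it is actually vectorized still depends on the target and on the 
cost model), a shorter add-only variant of your example could be:

```c
/* slp_add.c: hypothetical file name, used only for illustration.
 *
 * Possible way to inspect the IR (same flags as in the reply below):
 *   clang -O3 -march=native -S -emit-llvm -o - slp_add.c
 */
void foo_add(int a1, int a2, int b1, int b2, int *A) {
    /* Four stores to consecutive elements, each fed by the same kind of
     * operation (an integer add). Adjacent stores of similar, independent
     * expressions are the pattern the SLP vectorizer tries to combine. */
    A[0] = a1 + a2;
    A[1] = b1 + b2;
    A[2] = a1 + b2;
    A[3] = a2 + b1;
}
```

Integer addition has vector forms on common SIMD instruction sets, so 
this avoids the division problem, but I have not verified the output on 
every target.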

lsg

On 11/28/2019 12:43 PM, Adrien Guinet via llvm-dev wrote:
> On 11/28/19 6:45 PM, Sandoval Gonzalez, Leonardo via llvm-dev wrote:
>> Hi,
>>
>> I am new to LLVM, with a particular interest in the optimization area,
>> especially SLP. While working through the tutorial, I ran this example
>> [1] hoping to see SLP vectorization in action, but for some reason
>> I do not see it in the LLVM assembly, as seen below. Is there
>> anything I am missing? I am using Clearlinux as the build machine, and
>> it has clang version 9.0.0.
>>
>
> If you're on Intel hardware, I'd say that AFAIK there is no vector
> integer division instruction, so LLVM won't vectorize this code.
> Moreover, you should specify which instruction set you want to use,
> either by specifying the CPU architecture with -march, or by enabling
> individual instruction sets by "hand" (e.g. with -mavx2).
>
> For instance, the SLP vectorizer will work here:
>
> $ cat a.c
> void foo(int a1, int a2, int b1, int b2, int *A) {
>    A[0] = a1+a2;
>    A[1] = b1+b2;
>    A[2] = a1+b2;
>    A[3] = a2+b1;
>    A[4] = a2+a2;
>    A[5] = b2+b2;
>    A[6] = a2+b2;
>    A[7] = a1+b1;
> }
>
> $ clang-9 -S -emit-llvm -O3 -march=native -o - a.c
> define dso_local void @foo(i32, i32, i32, i32, i32* nocapture)
> local_unnamed_addr #0 {
>   [...]
>    %20 = add nsw <8 x i32> %13, %19
>    [...]
> }