[LLVMdev] Autovectorization questions

Wed Mar 12 17:40:21 PDT 2014

Yes, LLVM will insert runtime checks.
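
As a rough illustration of what such a runtime check amounts to, here is a hand-written sketch of loop versioning for the A[i] += B[i*k] example quoted below. It is not the code LLVM actually emits. The names ranges_overlap, loop_noalias, loop_may_alias and foo_versioned are made up for this sketch, the sketch assumes k >= 0, and the return value is there only so the chosen path can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if [a, a + a_len) and [b, b + b_len) share any element.
   Lengths are in elements; addresses are compared as integers. */
static int ranges_overlap(const int *a, size_t a_len,
                          const int *b, size_t b_len) {
    uintptr_t a_lo = (uintptr_t)a, a_hi = a_lo + a_len * sizeof(int);
    uintptr_t b_lo = (uintptr_t)b, b_hi = b_lo + b_len * sizeof(int);
    return a_lo < b_hi && b_lo < a_hi;
}

/* Version the compiler is free to vectorize: restrict promises no overlap. */
static void loop_noalias(int *restrict A, const int *restrict B, int n, int k) {
    for (int i = 0; i < n; ++i)
        A[i] += B[(size_t)i * (size_t)k];
}

/* Conservative fallback that stays correct when A and B overlap. */
static void loop_may_alias(int *A, const int *B, int n, int k) {
    for (int i = 0; i < n; ++i)
        A[i] += B[(size_t)i * (size_t)k];
}

/* Hypothetical wrapper: returns 1 if the no-alias path ran, 0 otherwise. */
int foo_versioned(int *A, int *B, int n, int k) {
    assert(k >= 0);                 /* assumption of this sketch */
    if (n <= 0)
        return 1;                   /* nothing to do */
    /* B is read at indices 0, k, ..., (n-1)*k. */
    size_t b_span = (size_t)(n - 1) * (size_t)k + 1;
    if (!ranges_overlap(A, (size_t)n, B, b_span)) {
        loop_noalias(A, B, n, k);
        return 1;
    }
    loop_may_alias(A, B, n, k);
    return 0;
}
```

The check runs once before the loop, so its cost is small compared with the loop body whenever n is reasonably large.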

> On Mar 12, 2014, at 5:26 PM, Raul Silvera <rsilvera at google.com> wrote:
> 
> Even without wrapping around the end of the address space, without restrict you still have to worry about A and B overlapping in interesting ways. Will LLVM do some runtime dependence checks to rule out such potential overlap? Just curious...
> 
> 
>> On Wed, Mar 12, 2014 at 5:01 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>> Zinovy,
>> 
>> to clarify: the code is vectorizable. But LLVM currently fails to prove it is.
>> 
>> On Mar 12, 2014, at 3:50 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>> 
>> > In order to vectorize code like this, LLVM needs to prove that “A[i*7]” does not wrap in the address space. It fails to do so, and therefore LLVM doesn’t vectorize this loop even if we try to force it.
>> >
>> > The following loop will be vectorized if we force it:
>> >
>> > void foo(int * A, int * B, int n, int k) {
>> >  for (int i = 0; i < 1024; ++i)
>> >    A[i] += B[i*k];
>> > }
>> >
>> > So will this loop:
>> >
>> > void foo(int * restrict A, int * restrict B, int n, int k) {
>> >  for (int i = 0; i < n; ++i)
>> >    A[i] += B[i*k];
>> > }
>> >
>> > I will update the example.
>> >
>> > Thanks,
>> > Arnold
>> >
>> > On Mar 12, 2014, at 1:54 PM, Nadav Rotem <nrotem at apple.com> wrote:
>> >
>> >> Hi Zinovy,
>> >>
>> >> The loop vectorizer probably decided that it was not profitable to vectorize the function. You can force the vectorization of the function by setting a low threshold.
>> >>
>> >> Thanks,
>> >> Nadav
>> >>
>> >> On Mar 12, 2014, at 3:34 AM, Zinovy Nis <zinovy.nis at gmail.com> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I'm reading "http://llvm.org/docs/Vectorizers.html" and have a few questions. I hope someone can answer them.
>> >>>
>> >>>
>> >>> The Loop Vectorizer can vectorize code that becomes a sequence of scalar instructions that scatter/gathers memory. (http://llvm.org/docs/Vectorizers.html#scatter-gather)
>> >>>
>> >>> int foo(int *A, int *B, int n, int k) {
>> >>>  for (int i = 0; i < n; ++i)
>> >>>    A[i*7] += B[i*k];
>> >>> }
>> >>>
>> >>> I replaced "int *A"/"int *B" with "double *A"/"double *B" and then compiled the sample with
>> >>>
>> >>> $> ./clang -Ofast -ffast-math test.c -std=c99 -march=core-avx2 -S -o bb.S  -fslp-vectorize-aggressive
>> >>>
>> >>> and the loop body looks like this:
>> >>>
>> >>> .LBB1_2:                                # %for.body
>> >>>                                        # =>This Inner Loop Header: Depth=1
>> >>>        cltq
>> >>>        vmovsd  (%rsi,%rax,8), %xmm0
>> >>>        movq    %r9, %r10
>> >>>        sarq    $32, %r10
>> >>>        vaddsd  (%rdi,%r10,8), %xmm0, %xmm0
>> >>>        vmovsd  %xmm0, (%rdi,%r10,8)
>> >>>        addq    %r8, %r9
>> >>>        addl    %ecx, %eax
>> >>>        decl    %edx
>> >>>        jne     .LBB1_2
>> >>>
>> >>> so scalar AVX instructions (vaddsd, vmovsd) were used in the loop, and no real gather/scatter was emitted.
>> >>>
>> >>> The question is: why was this loop not vectorized? Is there a typo in the docs?
>> >>>
>> >>> _______________________________________________
>> >>> LLVM Developers mailing list
>> >>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> >>
>> >
>> 
>> 
> 
> 
> 
> -- 
>  Raúl E. Silvera 
> 