[LLVMdev] Autovectorization questions

Wed Mar 12 17:26:43 PDT 2014

Even without wrapping around the end of the address space, without restrict
you still have to worry about A and B overlapping on interesting ways. Will
LLVM do some runtime dependence checks to discount such potential overlap?
Just curious....

On Wed, Mar 12, 2014 at 5:01 PM, Arnold Schwaighofer <
aschwaighofer at apple.com> wrote:

> Zinovy,
>
> to clarify: the code is vectorizable. But LLVM currently fails to prove it
> is.
>
> On Mar 12, 2014, at 3:50 PM, Arnold Schwaighofer <aschwaighofer at apple.com>
> wrote:
>
> > In order to vectorize code like this LLVM needs to prove that “A[i*7]”
> does not wrap in the address space. It fails to do so and so LLVM doesn’t
> vectorize this loop even if we try to force it.
> >
> > The following loop will be vectorized if we force it:
> >
> > int foo(int * A, int * B, int n, int k) {
> >  for (int i = 0; i < 1024; ++i)
> >    A[i] += B[i*k];
> > }
> >
> > So will this loop:
> >
> > int foo(int * restrict A, int * restrict B, int n, int k) {
> >  for (int i = 0; i < n; ++i)
> >    A[i] += B[i*k];
> > }
> >
> > I will update the example.
> >
> > Thanks,
> > Arnold
> >
> > On Mar 12, 2014, at 1:54 PM, Nadav Rotem <nrotem at apple.com> wrote:
> >
> >> Hi Zinovy,
> >>
> >> The loop vectorizer probably decided that it was not profitable to
> vectorize the function. You can force the vectorization of the function by
> setting a low threshold.
> >>
> >> Thanks,
> >> Nadav
> >>
> >> On Mar 12, 2014, at 3:34 AM, Zinovy Nis <zinovy.nis at gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I'm reading "http://llvm.org/docs/Vectorizers.html" and have few
> question. Hope someone has answers on it.
> >>>
> >>>
> >>> The Loop Vectorizer can vectorize code that becomes a sequence of
> scalar instructions that scatter/gathers memory. (
> http://llvm.org/docs/Vectorizers.html#scatter-gather)
> >>>
> >>> int foo(int *A, int *B, int n, int k) {
> >>>  for (int i = 0; i < n; ++i)
> >>>    A[i*7] += B[i*k];
> >>> }
> >>>
> >>> I replaced "int *A"/"int *B" into "double *A"/"double *B" and then
> compiled the sample with
> >>>
> >>> $> ./clang -Ofast -ffast-math test.c -std=c99 -march=core-avx2 -S -o
> bb.S  -fslp-vectorize-aggressive
> >>>
> >>> and loop body looks like:
> >>>
> >>> .LBB1_2:                                # %for.body
> >>>                                        # =>This Inner Loop Header:
> Depth=1
> >>>        cltq
> >>>        vmovsd  (%rsi,%rax,8), %xmm0
> >>>        movq    %r9, %r10
> >>>        sarq    $32, %r10
> >>>        vaddsd  (%rdi,%r10,8), %xmm0, %xmm0
> >>>        vmovsd  %xmm0, (%rdi,%r10,8)
> >>>        addq    %r8, %r9
> >>>        addl    %ecx, %eax
> >>>        decl    %edx
> >>>        jne     .LBB1_2
> >>>
> >>> so vector instructions for scalars (vaddsd, vmovsd) were used in the
> loop and no real gather/scatter emitted.
> >>>
> >>> The question is why this loop was not vectorized? Typo in docs?
> >>>
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

-- 
 Raúl E. Silvera
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140312/86527c55/attachment.html>