[LLVMdev] Autovectorization questions
Arnold
aschwaighofer at apple.com
Wed Mar 12 17:40:21 PDT 2014
Yes, llvm will insert runtime checks.
> On Mar 12, 2014, at 5:26 PM, Raul Silvera <rsilvera at google.com> wrote:
>
> Even without wrapping around the end of the address space, without restrict you still have to worry about A and B overlapping on interesting ways. Will LLVM do some runtime dependence checks to discount such potential overlap? Just curious....
>
>
>> On Wed, Mar 12, 2014 at 5:01 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>> Zinovy,
>>
>> to clarify: the code is vectorizable. But LLVM currently fails to prove it is.
>>
>> On Mar 12, 2014, at 3:50 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>>
>> > In order to vectorize code like this LLVM needs to prove that “A[i*7]” does not wrap in the address space. It fails to do so and so LLVM doesn’t vectorize this loop even if we try to force it.
>> >
>> > The following loop will be vectorized if we force it:
>> >
>> > int foo(int * A, int * B, int n, int k) {
>> > for (int i = 0; i < 1024; ++i)
>> > A[i] += B[i*k];
>> > }
>> >
>> > So will this loop:
>> >
>> > int foo(int * restrict A, int * restrict B, int n, int k) {
>> > for (int i = 0; i < n; ++i)
>> > A[i] += B[i*k];
>> > }
>> >
>> > I will update the example.
>> >
>> > Thanks,
>> > Arnold
>> >
>> > On Mar 12, 2014, at 1:54 PM, Nadav Rotem <nrotem at apple.com> wrote:
>> >
>> >> Hi Zinovy,
>> >>
>> >> The loop vectorizer probably decided that it was not profitable to vectorize the function. You can force the vectorization of the function by setting a low threshold.
>> >>
>> >> Thanks,
>> >> Nadav
>> >>
>> >> On Mar 12, 2014, at 3:34 AM, Zinovy Nis <zinovy.nis at gmail.com> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I'm reading "http://llvm.org/docs/Vectorizers.html" and have few question. Hope someone has answers on it.
>> >>>
>> >>>
>> >>> The Loop Vectorizer can vectorize code that becomes a sequence of scalar instructions that scatter/gathers memory. (http://llvm.org/docs/Vectorizers.html#scatter-gather)
>> >>>
>> >>> int foo(int *A, int *B, int n, int k) {
>> >>> for (int i = 0; i < n; ++i)
>> >>> A[i*7] += B[i*k];
>> >>> }
>> >>>
>> >>> I replaced "int *A"/"int *B" into "double *A"/"double *B" and then compiled the sample with
>> >>>
>> >>> $> ./clang -Ofast -ffast-math test.c -std=c99 -march=core-avx2 -S -o bb.S -fslp-vectorize-aggressive
>> >>>
>> >>> and loop body looks like:
>> >>>
>> >>> .LBB1_2: # %for.body
>> >>> # =>This Inner Loop Header: Depth=1
>> >>> cltq
>> >>> vmovsd (%rsi,%rax,8), %xmm0
>> >>> movq %r9, %r10
>> >>> sarq $32, %r10
>> >>> vaddsd (%rdi,%r10,8), %xmm0, %xmm0
>> >>> vmovsd %xmm0, (%rdi,%r10,8)
>> >>> addq %r8, %r9
>> >>> addl %ecx, %eax
>> >>> decl %edx
>> >>> jne .LBB1_2
>> >>>
>> >>> so vector instructions for scalars (vaddsd, vmovsd) were used in the loop and no real gather/scatter emitted.
>> >>>
>> >>> The question is why this loop was not vectorized? Typo in docs?
>> >>>
>> >>> _______________________________________________
>> >>> LLVM Developers mailing list
>> >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> >>
>> >> _______________________________________________
>> >> LLVM Developers mailing list
>> >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> >
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>
> --
> Raúl E. Silvera
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140312/8426d420/attachment.html>
More information about the llvm-dev
mailing list