[LLVMdev] Auto-vectorization and phi nodes

Tue Feb 19 07:51:11 PST 2013

----- Original Message -----
> From: "Vesa Norilo" <vnorilo at siba.fi>
> To: llvmdev at cs.uiuc.edu
> Sent: Tuesday, February 19, 2013 4:40:26 AM
> Subject: [LLVMdev] Auto-vectorization and phi nodes
> 
> Hi all,
> 
> Sorry if this is a dumb question, a FAQ, or the wrong list!
> 
> I'm currently investigating LLVM vectorization of my generated code. My
> codegen emits a lot of recursions that step through arrays via pointers.
> The recursions are nicely optimized into loops, but the loop vectorizer
> can't seem to handle them because of phi nodes that point to gep nodes.
> 
> Some simple IR to demonstrate; it vectorizes nicely with opt -O3
> -vectorize-loops -force-vector-width until I uncomment the phi/gep
> nodes.
> 
> define void @add_vector(float* noalias %a, float* noalias %b,
>                         float* noalias %c, i32 %num)
> {
> Top:
>          br label %Loop
> Loop:
>          %i = phi i32 [0,%Top],[%i.next,%Loop]
> 
> ;        phi and gep - won't vectorize
> ;        %a.ptr = phi float* [%a,%Top],[%a.next,%Loop]
> ;        %b.ptr = phi float* [%b,%Top],[%b.next,%Loop]
> ;        %c.ptr = phi float* [%c,%Top],[%c.next,%Loop]
> 
> ;        %a.next = getelementptr float* %a.ptr, i32 1
> ;        %b.next = getelementptr float* %b.ptr, i32 1
> ;        %c.next = getelementptr float* %c.ptr, i32 1
> 
> ;        induction variable as index - will vectorize
>          %a.ptr = getelementptr float* %a, i32 %i
>          %b.ptr = getelementptr float* %b, i32 %i
>          %c.ptr = getelementptr float* %c, i32 %i
> 
>          %a.val = load float* %a.ptr
>          %b.val = load float* %b.ptr
>          %sum = fadd float %a.val, %b.val
>          store float %sum, float* %c.ptr
> 
>          %i.next = add i32 %i, 1
>          %more = icmp slt i32 %i.next, %num
>          br i1 %more, label %Loop, label %End
> End:
>          ret void
> }
> 
> So it seems that the loop vectorizer would like the pointer stepping to
> be converted to base+index. However, as expected, clang doesn't care
> whether the C code is written with pointer arithmetic or array indexing.
> 
> Is there a pass that converts simple pointer arithmetic to base+index?

As I recall, loop strength reduction can do this, but it runs only very late in the compilation process (well after vectorization). It would probably be better to update the loop vectorizer to deal with this directly. Nadav?

 -Hal
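
For anyone following along, the two loop shapes under discussion can be sketched in C roughly as below. This is only an illustration: the function names add_vector_ptr, add_vector_idx and check_forms_agree are made up for this sketch and are not part of the original post, and restrict is used here as an assumed C counterpart of the noalias attributes in the IR. The first function steps the pointers themselves, which is what the commented-out phi/gep pairs express; the second derives every address from a single induction variable, which is the base+index form.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer-stepping form: each pointer is its own loop-carried value,
   which is what the phi/gep pairs in the quoted IR express. */
void add_vector_ptr(const float *restrict a, const float *restrict b,
                    float *restrict c, int num)
{
    for (int i = 0; i < num; ++i) {
        *c++ = *a++ + *b++;
    }
}

/* Base+index form: one induction variable, addresses computed from it. */
void add_vector_idx(const float *restrict a, const float *restrict b,
                    float *restrict c, int num)
{
    for (int i = 0; i < num; ++i) {
        c[i] = a[i] + b[i];
    }
}

/* Small self-check (hypothetical helper): both forms must produce
   identical sums for the same inputs. */
void check_forms_agree(void)
{
    float a[8], b[8], c_ptr[8], c_idx[8];
    for (int i = 0; i < 8; ++i) {
        a[i] = (float)i;
        b[i] = 2.0f * (float)i;
    }
    add_vector_ptr(a, b, c_ptr, 8);
    add_vector_idx(a, b, c_idx, 8);
    for (int i = 0; i < 8; ++i) {
        assert(c_ptr[i] == c_idx[i]);
    }
}
```

Both functions compute the same result; they differ only in how the addresses are formed, which is the distinction the question is about.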

> If not, should I write one (shouldn't be too hard for my limited use
> case) or try to emit more vector-friendly code from the front end?
> 
> Thanks a bunch!
> Vesa Norilo