[LLVMdev] Vectorizing alloca instructions

Thu Oct 24 15:30:59 PDT 2013

On Thu, Oct 24, 2013 at 03:16:52PM -0700, Nadav Rotem wrote:
> 
> On Oct 24, 2013, at 3:00 PM, Chandler Carruth <chandlerc at google.com> wrote:
> 
> > Just a note, I don't think you should or need to vectorize the actual alloca stuff. If you can simply transform the dynamically indexed load:
> > 
> > Then running SROA and InstCombine will mop up the rest. So it's mostly about getting the SLPVectorizer to handle the dynamic GEP. As soon as it does that, everything else will fall away.
> > 
> 
> I don't think that Tom wants the SLP-vectorizer to handle dynamic GEPs.  What he wants is for the SLP-vectorizer to convert the first part of the code:
>

My goal is to eliminate all alloca instructions, so vectorizing the
stores is just the first step.  I do want the dynamic GEPs removed and
replaced with dynamic extractelement instructions.  Is the
SLP-vectorizer the right place to do this?
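
To make the goal concrete, here is a rough sketch of the two ends of
the transformation.  The function names and the exact "before" form of
the dynamically indexed load are illustrative assumptions, written in
the same typed-pointer syntax as the example quoted below; this is not
output from any existing pass.

```llvm
; Before (assumed input): the array lives in an alloca and is read
; through a GEP whose last index is only known at run time.
define void @dynamic_gep(i32 addrspace(1)* %out, i32 %index) {
entry:
  %array = alloca [4 x i32]
  %x = getelementptr [4 x i32]* %array, i32 0, i32 0
  %y = getelementptr [4 x i32]* %array, i32 0, i32 1
  %z = getelementptr [4 x i32]* %array, i32 0, i32 2
  %w = getelementptr [4 x i32]* %array, i32 0, i32 3
  store i32 0, i32* %x
  store i32 1, i32* %y
  store i32 2, i32* %z
  store i32 3, i32* %w
  ; Dynamic GEP followed by a scalar load.
  %ptr = getelementptr [4 x i32]* %array, i32 0, i32 %index
  %val = load i32* %ptr
  store i32 %val, i32 addrspace(1)* %out
  ret void
}

; After (desired result): no alloca remains, and the dynamic GEP has
; become a dynamic extractelement on a constant vector.
define void @no_alloca(i32 addrspace(1)* %out, i32 %index) {
entry:
  %val = extractelement <4 x i32> <i32 0, i32 1, i32 2, i32 3>, i32 %index
  store i32 %val, i32 addrspace(1)* %out
  ret void
}
```

The intermediate step would presumably be the vector store plus the
bitcast, vector load and extractelement shown in the quoted example,
which later cleanup passes could then fold into the second function.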

-Tom

> 
> >> define void @vector(i32 addrspace(1)* %out, i32 %index) {
> >> entry:
> >>   %0 = alloca [4 x i32]
> >>   %x = getelementptr [4 x i32]* %0, i32 0, i32 0
> >>   %y = getelementptr [4 x i32]* %0, i32 0, i32 1
> >>   %z = getelementptr [4 x i32]* %0, i32 0, i32 2
> >>   %w = getelementptr [4 x i32]* %0, i32 0, i32 3
> >>   store i32 0, i32* %x
> >>   store i32 1, i32* %y
> >>   store i32 2, i32* %z
> >>   store i32 3, i32* %w
> > 
> 
> Into this: store <i32 0, i32 1, i32 2, i32 3> ....
> 
> > 
> >>   %1 = bitcast [4 x i32]* %0 to <4 x i32>*
> >>   %2 = load <4 x i32>* %1
> >>   %3 = extractelement <4 x i32> %2, i32 %index
> >>   store i32 %3, i32 addrspace(1)* %out
> >>   ret void
> >> }
> > 
> 
>