[LLVMdev] Vectorization of pointer PHI nodes

Mon Oct 14 09:44:52 PDT 2013

Hi Renato, 

Thanks for working on this. As you said, we don't support pointer reductions, but handling them should be straightforward. You can copy the logic for handling RK_IntegerAdd and create a new enum entry for RK_PointerAdd. You will need to detect the relevant patterns (probably GEPs) and implement the cost model and vectorization parts.

You will also need to generate vector GEPs that represent the pointer increment. Vector GEPs are rare, so this may trigger bugs in other parts of the compiler (vector GEPs usually live only between the vectorizer, which is a very late pass, and the SelectionDAG builder).

I wonder how common this pattern is. Before I add new features to the vectorizer, I usually write code that does an “fprintf” into a file every time I detect the pattern that I want to optimize. Next, I run the LLVM test suite and look at the C files that triggered the pattern. This helps in estimating the profitability of the optimization and in finding additional interesting cases.
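
To make the pointer-increment idea concrete, here is a rough sketch in plain C (not LLVM code) of what a widened pointer induction computes: a base pointer plus one offset per lane, with the base advancing by the vectorization factor each iteration. VF, the function names, and the "multiply by 2" body are all made up for illustration.

```c
#include <stddef.h>

#define VF 4  /* hypothetical vectorization factor, chosen for illustration */

/* Scalar form: one pointer pair, bumped once per iteration. */
void scale_scalar(const int *src, int *dst, size_t n) {
    for (size_t i = 0; i < n; ++i)
        *dst++ = 2 * *src++;
}

/* Widened form: each "vector" iteration addresses lanes base+0 .. base+VF-1,
   which is roughly the role a vector GEP would play, and then the base
   pointers step by VF elements. */
void scale_widened(const int *src, int *dst, size_t n) {
    const int *in = src;
    int *out = dst;
    size_t i = 0;

    for (; i + VF <= n; i += VF) {
        for (int lane = 0; lane < VF; ++lane) {
            const int *in_lane = in + lane;   /* lane of the pointer vector */
            int *out_lane = out + lane;
            *out_lane = 2 * *in_lane;
        }
        in += VF;   /* pointer increment, scaled by VF */
        out += VF;
    }

    /* Scalar remainder loop for the leftover elements. */
    for (; i < n; ++i)
        *out++ = 2 * *in++;
}
```

Both functions should produce the same output for any n, which is the property a pointer reduction kind would have to preserve.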

Thanks,
Nadav

On Oct 14, 2013, at 9:29 AM, Renato Golin <renato.golin at linaro.org> wrote:

> Hi Nadav, Arnold, (now copying LLVM-dev, not commits),
> 
> I'm working on an example of code that current GCC trunk can vectorize but LLVM cannot. After analysing the intermediate IR just before loop vectorization and stepping through the process, I can see that the vectorizer only recognizes integer and float reduction variables, not pointers.
> 
> My code looks like this:
> 
> for (i: 0 -> MAX) {
>   a = *read++;
>   b = *read++;
>   c = *read++;
> 
>   // do the same stuff to a, b, c
> 
>   *write++ = a;
>   *write++ = b;
>   *write++ = c;
> }
> 
> Vectorizing this is very simple: it's a sequence of VLD3 + VOPS + VST3, which GCC does nicely, but we don't.
> 
> What would be the steps in adding a pointer increment reduction kind (RK_PointerInc)? I believe the logic would be similar to RK_IntegerAdd, but with a stride of type size, right?
> 
> Or maybe we'd have to translate the loop into something like:
> 
> for (i: 0 -> MAX, +=3) {
>   write[i] = (op) read[i];
>   write[i+1] = (op) read[i+1];
>   write[i+2] = (op) read[i+2];
> }
> 
> So that the reduction variable gets recognized?
> 
> cheers,
> --renato
>
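
For reference, the two loop shapes discussed in the quoted message can be written out as runnable C. This is only a sketch under stated assumptions: the reads are assumed to advance like the writes, the buffers are assumed not to overlap, and op() is a hypothetical stand-in for "do the same stuff to a, b, c".

```c
#include <stddef.h>

/* Hypothetical stand-in for "do the same stuff to a, b, c". */
static int op(int x) { return x + 1; }

/* Pointer-increment form: both pointers are bumped three times per
   iteration, so neither access is expressed through the loop counter. */
void triple_ptr(const int *read, int *write, size_t groups) {
    for (size_t i = 0; i < groups; ++i) {
        int a = *read++;
        int b = *read++;
        int c = *read++;
        *write++ = op(a);
        *write++ = op(b);
        *write++ = op(c);
    }
}

/* Indexed form: every access is an offset from an integer induction
   variable that steps by 3. Assumes read and write do not overlap. */
void triple_idx(const int *read, int *write, size_t groups) {
    for (size_t i = 0; i < 3 * groups; i += 3) {
        write[i]     = op(read[i]);
        write[i + 1] = op(read[i + 1]);
        write[i + 2] = op(read[i + 2]);
    }
}
```

With non-overlapping buffers the two forms write identical results; only the second exposes an integer induction variable.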