[LLVMdev] First attempt at recognizing pointer reduction
Arnold Schwaighofer
aschwaighofer at apple.com
Mon Oct 21 09:29:11 PDT 2013
Renato,
can you post a hand-created vectorized IR of how a reduction would work on your example?
I don’t think that recognizing this as a reduction is going to get you far. A reduction is beneficial if the value reduced is only truly needed outside of a loop.
This is not the case here (we are storing/loading from the pointer).
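For contrast, here is a minimal C sketch of a case where treating a value as a reduction does pay off: the accumulator is only read after the loop, so each vector lane can keep its own partial result and the lanes are combined once at the end. The function names and the hand-unrolled VF=2 form are illustrative assumptions, not what the vectorizer actually emits.

```c
#include <stddef.h>

/* Scalar sum: 'acc' is only needed once the loop has finished. */
int sum_scalar(const int *a, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += a[i];
    return acc;
}

/* Hand-written stand-in for a VF=2 reduction (hypothetical helper):
 * two independent partial sums, combined only after the loop. */
int sum_vf2(const int *a, size_t n) {
    int acc0 = 0, acc1 = 0;      /* one partial sum per "lane" */
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        acc0 += a[i];
        acc1 += a[i + 1];
    }
    int acc = acc0 + acc1;       /* final combine, outside the loop */
    for (; i < n; ++i)           /* scalar remainder */
        acc += a[i];
    return acc;
}
```

Nothing inside the loop body reads the combined value, which is why the split is legal. A pointer that is loaded from or stored through on every iteration does not have that property.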
Your example is something like
WRITEPTR = phi i8* [ outsideval, preheader] [WPTR, loop]
store WRITEPTR
...
WPTR = getelementptr WRITEPTR, 3
If you treat this as a reduction, you will end up with code like (say we used a VF=2):
WRITEPTR = phi <2 x i8*> [ <outsideval, 0>, preheader] [WPTR, loop]
ptr = WRITEPTR<0> + WRITEPTR<1>
store ptr
...
WPTR = getelementptr WRITEPTR, <3,3>
I think the right approach for examples like this is to recognize strided pointer inductions (we currently only support a stride of one).
Basically we need to be able to see
phi_ptr = phi double*
phi_ptr = gep phi_ptr, 2
and make a canonical induction variable out of this
phi_ind = phi 0, phi_ind
ptr = gep ptr_val_from_outside_loop, 2*phi_ind
phi_ind = add phi_ind, 1
(of course virtually).
Or in other words:
for (i++)
*ptr++ =
*ptr++ =
has to look like
for (i++)
ptr[2*i] =
ptr[2*i+1] =
to us when we are vectorizing this.
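As a rough C sketch of the two forms (the function names and the stored values are placeholders I made up, since the original right-hand sides are elided), both loops below write the same elements. Only the second one exposes each address as a base plus an affine function of i:

```c
#include <stddef.h>

/* Pointer-bump form: the address is carried from one iteration
 * to the next through the pointer itself. */
void fill_bump(double *ptr, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        *ptr++ = (double)i;       /* placeholder value */
        *ptr++ = -(double)i;      /* placeholder value */
    }
}

/* Canonical-induction form: every address is base + 2*i (+1),
 * computed from the integer induction variable alone. */
void fill_indexed(double *ptr, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        ptr[2 * i]     = (double)i;
        ptr[2 * i + 1] = -(double)i;
    }
}
```

In the indexed form the stride (2 elements per iteration) can be read straight off the subscript, which is the information a strided access analysis would need.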
Thanks,
Arnold
On Oct 21, 2013, at 10:23 AM, Renato Golin <renato.golin at linaro.org> wrote:
> Hi Nadav, Arnold,
>
> I managed to find some time to work on the pointer reduction, and I got a patch that can make "canVectorize()" pass.
>
> Basically what I do is teach AddReductionVar() about pointers, saying they don't really have an exit instruction, and that (maybe) the final store is a good candidate (is it?).
>
> This makes it recognize the writes and reads, but then "canVectorizeMemory()" bails because it can't find the array bounds. This will be my next step, but I'd like to know if I'm in the right direction, with the right assumptions about the reduction detection, before proceeding into more complicated bits.
>
> An example of the debug output I get for my earlier example:
>
> LV: Checking a loop in "fn1"
> LV: Found a loop: for.body
>
> == Induction ==
>
> LV: Found an induction variable.
>
> == Our write reduction ==
>
> LV: Found a pointer add reduction PHI. %WRITE913 = phi i8* [ %incdec.ptr18, %for.body ], [ %WRITE, %for.body.preheader ]
>
> == Not sure what this is... Will check. ==
>
> LV: Found an non-int non-pointer PHI.
>
> == Our read reduction ==
>
> LV: Found a pointer add reduction PHI. %READ1012 = phi i8* [ %incdec.ptr2, %for.body ], [ %READ, %for.body.preheader ]
>
> == Below is the validation of the memory operations, which I'll deal with later ==
>
> LV: Found an unidentified write ptr: i8* %WRITE
> LV: Found an unidentified write ptr: i8* %WRITE
> LV: Found an unidentified write ptr: i8* %WRITE
> LV: Found an unidentified read ptr: i8* %READ
> LV: Found an unidentified read ptr: i8* %READ
> LV: Found an unidentified read ptr: i8* %READ
> LV: Found a runtime check ptr: %WRITE913 = phi i8* [ %incdec.ptr18, %for.body ], [ %WRITE, %for.body.preheader ]
> LV: Found a runtime check ptr: %incdec.ptr16 = getelementptr inbounds i8* %WRITE913, i64 1
> LV: Found a runtime check ptr: %incdec.ptr17 = getelementptr inbounds i8* %WRITE913, i64 2
> LV: Found a runtime check ptr: %READ1012 = phi i8* [ %incdec.ptr2, %for.body ], [ %READ, %for.body.preheader ]
> LV: Found a runtime check ptr: %incdec.ptr = getelementptr inbounds i8* %READ1012, i64 1
> LV: Found a runtime check ptr: %incdec.ptr1 = getelementptr inbounds i8* %READ1012, i64 2
> LV: We need to do 18 pointer comparisons.
> LV: We can't vectorize because we can't find the array bounds.
>
> cheers,
> --renato
> <pointer-reduction.patch>