[PATCH] D25276: [LoopVectorizer] Interleaved-mem-accesses analysis and getPtrStride

Dorit Nuzman via llvm-commits llvm-commits at lists.llvm.org
Sun Oct 23 02:45:23 PDT 2016


dorit added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5872
+        continue;
+      Value *Ptr = getPointerOperand(Member);
+      int Stride = getPtrStride(PSE, Ptr, TheLoop, Strides, /*Assume=*/false,
----------------
sbaranga wrote:
> dorit wrote:
> > dorit wrote:
> > > sbaranga wrote:
> > > > I think there is another case here when we don't need to check the wrapping: if the group has the first and last pointers, and we know they both don't wrap.
> > > > 
> > > > We should be able to apply this for versioning later and have at most two no-wrap checks per interleaved group.
> > > So, trying to summarize:
> > > 1) Initial check with Assume=true, ShouldCheckWrap=false;
> > > 2) Revisit only groups with gaps as follows: 
> > > - if first and last elements are accessed: skip further checking.
> > > - otherwise – check the first and last pointers of the group with Assume=false (for now), ShouldCheckWrap=true;
> > > 
> > > Silviu, is that what you meant?
> > > 
> > 3 and 4 above were meant to be sub-bullets of item 2...:
> > 
> > 1.Initial check with Assume=true, ShouldCheckWrap=false;
> > 2.Revisit only groups with gaps as follows:
> >    - if first and last elements are accessed: skip further checking.
> >    - otherwise – check the first and last pointers of the group with Assume=false (for now), ShouldCheckWrap=true;
> > 
> I agree with point 1)
> For point 2) I meant something slightly different:
>   - if first and last elements are accessed *and* we know they don't wrap, skip further checking (because it implies that all the pointers in the group don't wrap).
>   - if the group is full, skip further checking
>   - otherwise (as you said) check the first and last pointers of the group with Assume=false (for now), ShouldCheckWrap=true;
> 
> I think optimizing this is a bit open-ended, but we could definitely go further (feel free to omit this for this commit):
> 
> If we are peeling the loop and we know that the first pointer doesn't wrap then we can deduce that all pointers in the group don't wrap.
> 
> This means that we can forcefully peel the loop in order to only have to check the first pointer for no-wrap. When 
> we'll change to use Assume=true,ShouldCheckWrap=true we'll only need at most one runtime check per interleaved group (I think this is probably optimal).
about “*and* we know they don't wrap”….:

1)	exactly what do you mean by “know they don’t wrap”? that calling getPtrStride with  Assume=false, ShouldCheckWrap=true is successful?

2)	I would think that “group is full” and “first and last elements are accessed” are equivalent in terms of the effect of the transformation on creating new potentially-wrapping accesses (in both cases the transformation will not make things worse than the original code in that respect); we shouldn’t care if the accesses to the first/last elements wrap or not because the wide loads that we’ll create will have the same wrap-around behavior as the original code, no? (in short, why the extra "*and* we know they don't wrap"...?)

(this is assuming that potential wrapping that may be caused by extra accesses in the vector code created due to misalignment are accounted for else-where). 




================
Comment at: llvm/test/Transforms/LoopVectorize/interleaved-accesses-2.ll:16
+; with proper threshold checks.
+; XFAIL: *
+
----------------
sbaranga wrote:
> dorit wrote:
> > sbaranga wrote:
> > > mssimpso wrote:
> > > > I would prefer that we not XFAIL a new test. Unless Silviu disagrees with letting Assume=true, I think it would be better to check the actual behavior while keeping the FIXME.
> > > > 
> > > > Would we be missing a test though? If Assume=true here, I don't think we will be checking the case that we actually remove a group that is may wrap?
> > > I think having Assume=true for the initial check should be fine.
> > Right; 
> > 
> > Silviu, the example you gave earlier for a strided access that requires Assume=true: How would you turn it into a test that doesn't fail  vectorization on "SCEV could not compute the loop exit count"...? 
> I've tried the following function, and you do have to have assume=true for it in order to form interleaved groups
> 
>  void func(unsigned * restrict a, unsigned * restrict b, unsigned char x, unsigned char y) {
>      int i = 0;
>      for (unsigned char index = x; i < y; index +=2, ++i)
>        b[i] = a[index] * 2;
>    
>    }
perfect. thanks!!


https://reviews.llvm.org/D25276





More information about the llvm-commits mailing list