[PATCH] D36113: [Loop Vectorize] Vectorize Loops with Backward Dependence

Tue Aug 15 08:05:34 PDT 2017

dberlin added a comment.

================
Comment at: lib/Transforms/Vectorize/LoopVectorizePred.cpp:187
+      if (MSSA->isLiveOnEntryDef(D))
+        Reordered |= checkDepAndReorder(St, Ld);
+      else if (Instruction *DefInst = dyn_cast<Instruction>(D)) {
----------------
mssimpso wrote:
> dberlin wrote:
> > DIVYA wrote:
> > > hfinkel wrote:
> > > > I'm somewhat concerned that a number of these cases are actually handling loop-invariant values, meaning that we're doing a suboptimal handling of something that LICM should handle more fully. The problem is that dealing with these after vectorization could be more difficult than before, and if we have a phase-ordering problem such that we're missing these cases, we might just end up with suboptimal code. Are you seeing these cases in practice?
> > > > 
> > > 
> > > I think loop-invariant code will already have been hoisted out of the loop before this pass runs. The defining access will be liveOnEntry for loads and stores through pointer arguments.
> > > For example, in the code below, the defining access for the load instruction for a[i+1] is liveOnEntry. However, the load instruction is not loop invariant.
> > > 
> > > For the case
> > > int  foo1(int n,int * restrict a, int * restrict b, int *restrict m){
> > >   int i;
> > >   for (i = 0; i < n; i++){
> > >     a[i] = b[i];
> > >     m[i] = a[i+1];
> > >   }
> > > }
> > > 
> > > 
> > I'm surprised it can prove that it does not conflict with the store.
> > The load instruction actually is loop invariant in the sense that the value at a given loaded address cannot change during the loop; it just takes on a different loop-invariant value each iteration.
> > 
> > It is possible to hoist it out of this loop; it just requires making a different loop :)
> > 
> > In particular, this generates the same result:
> > 
> > ```
> > int foo1(int n,int * restrict a, int * restrict b, int *restrict m){
> > 
> > int i;
> > for (i = 0; i < n; i++){
> >   m[i] = a[i+1];
> > }
> > for (i = 0; i < n; i++){
> >   a[i] = b[i];
> > }
> > }
> > ```
> I wonder if our loop distribution pass (disabled by default) can handle this case?
I also wonder if one of our passes could ever transform it to memcpy, since the first loop is just a memcpy(m, a+1, n*sizeof(*a)).
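
To make the equivalence concrete, here is a minimal sketch of the memcpy form next to the original fused loop. The names `foo1_fused` and `foo1_memcpy` are made up for illustration, the functions return void for simplicity, and the sketch assumes that `a` has at least n+1 elements and that the `restrict` qualifiers really hold. It is not what any existing pass emits.

```c
#include <string.h>

/* Original fused loop from the review (hypothetical name). */
void foo1_fused(int n, int *restrict a, const int *restrict b,
                int *restrict m) {
    for (int i = 0; i < n; i++) {
        a[i] = b[i];
        /* a[i+1] is read here before iteration i+1 overwrites it,
           so m[i] always receives the value a[i+1] had on entry. */
        m[i] = a[i + 1];
    }
}

/* Distributed form, with each loop written as a memcpy (hypothetical name).
   Assumes a has at least n+1 elements and that a, b, m do not alias. */
void foo1_memcpy(int n, int *restrict a, const int *restrict b,
                 int *restrict m) {
    if (n <= 0)
        return;
    memcpy(m, a + 1, (size_t)n * sizeof *a); /* m[i] = a[i+1] */
    memcpy(a, b, (size_t)n * sizeof *a);     /* a[i] = b[i]   */
}
```

The order of the two copies matters: the copy into m has to read a[1..n] before the copy from b overwrites a[0..n-1].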

https://reviews.llvm.org/D36113