[PATCH] D110235: [LoopVectorize] Support reductions that store intermediary result

Igor Kirillov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 29 11:09:19 PST 2021


igor.kirillov marked 5 inline comments as done.
igor.kirillov added inline comments.


================
Comment at: llvm/lib/Analysis/IVDescriptors.cpp:301
+  //  - By store instructions with a loop invariant address (safe).
+  //      * If there are several stores, all must have the same address.
   //  - By an instruction that is not part of the reduction (not safe).
----------------
fhahn wrote:
> igor.kirillov wrote:
> > fhahn wrote:
> > > What about existing users of `isReductionPHI` which currently may rely on the fact that all instruction in the loop must either be phis or reduction operations? Also, with respect to the store restriction, the important bit is that the final value is also stored, right?
> > I checked and only `LoopInterchangeLegality` is using this function and it is not affected by stores. Anyway, I can add a parameter to `RecurrenceDescriptor::isReductionPHI` or a member to `RecurrenceDescriptor` allowing or not to handle stores. What do you think about it?
> > 
> > The comment is to be updated, yes.
> I guess it would be good to have at least a test case for loop-interchange to make sure it can deal with the change properly?
> 
You are right. We actually should not do loop interchange is an invariant address is present. Otherwise it may introduce incorrect result.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:900
+    // of one of our reductions and is unconditional.
+    for (StoreInst *SI : LAI->getInvariantStores()) {
+      if (isRecurringInvariantStore(SI, &IsPredicated)) {
----------------
fhahn wrote:
> igor.kirillov wrote:
> > fhahn wrote:
> > > igor.kirillov wrote:
> > > > fhahn wrote:
> > > > > What about loads to the same address in the loop? At the moment, `LAA` cannot analyze dependences with invariant addresses. But if this limitation gets removed the code here may become incorrect, because it relies on this limitation IIUC?
> > > > Loads from or stores to the same address in the loop? I'm sorry could you clarify what the problem is. As it is I don't understand the message.
> > > The case I was thinking about was something like the snippet below, where we have a load of the invariant address in the loop (`%lv = load...` in the example below).
> > > 
> > > ```
> > > define void @reduc_store(i32* %dst, i32* readonly %src, i32* noalias %dst.2) {
> > > entry:
> > >   %arrayidx = getelementptr inbounds i32, i32* %dst, i64 42
> > >   store i32 0, i32* %arrayidx, align 4
> > >   br label %for.body
> > > for.body:
> > >   %0 = phi i32 [ 0, %entry ], [ %add, %for.body ]
> > >   %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
> > >   %arrayidx1 = getelementptr inbounds i32, i32* %src, i64 %indvars.iv
> > >   %1 = load i32, i32* %arrayidx1, align 4
> > >   %add = add nsw i32 %0, %1
> > >   %lv = load i32, i32* %arrayidx
> > >   store i32 %add, i32* %arrayidx, align 4
> > >   
> > >   %gep.dst.2  = getelementptr inbounds i32, i32* %dst.2, i64 %indvars.iv
> > >   store i32 %lv, i32* %gep.dst.2,
> > > 
> > >   %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
> > >   %exitcond = icmp eq i64 %indvars.iv.next, 1000
> > >   br i1 %exitcond, label %for.cond.cleanup, label %for.body
> > > 
> > > for.cond.cleanup:
> > >   ret void
> > > }
> > > ```
> > In that case we do not vectorize, the rejection happens in `LoopAccessInfo::analyzeLoop` when loads ape processed:
> > 
> > ```
> >   for (LoadInst *LD : Loads) {
> >     Value *Ptr = LD->getPointerOperand();
> > ...
> >     // See if there is an unsafe dependency between a load to a uniform address and
> >     // store to the same uniform address.
> >     if (UniformStores.count(Ptr)) {
> >       LLVM_DEBUG(dbgs() << "LAA: Found an unsafe dependency between a uniform "
> >                            "load and uniform store to the same address!\n");
> >       HasDependenceInvolvingLoopInvariantAddress = true;
> >     }
> > ...
> > ```
> > 
> > I added `reduc_store_load` test anyway
> Yeah but this is only due to some limitations that currently existing in LAA, right? I think we should at least make this clear somewhere, e.g. in a comment.
Added comment


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110235/new/

https://reviews.llvm.org/D110235



More information about the llvm-commits mailing list