[PATCH] D98435: [LoopVectorize] Add strict in-order reduction support for fixed-width vectorization

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 25 13:56:38 PDT 2021


dmgreen added a comment.

Thanks. This LGTM, if there are no other comments.



================
Comment at: llvm/include/llvm/Analysis/IVDescriptors.h:257
   bool IsSigned = false;
+  // If this is an ordered reduction
+  bool IsOrdered = false;
----------------
kmclaughlin wrote:
> dmgreen wrote:
> > david-arm wrote:
> > > david-arm wrote:
> > > > spatel wrote:
> > > > > dmgreen wrote:
> > > > > > Am I correct in saying that Ordered reductions are a subset of floating point reductions that
> > > > > > - don't have AllowReassoc on the fadd
> > > > > > - have only a single fadd added in each loop iteration (but possibly predicated to be one of several additions). So either fadd(redphi(..)) or phi(fadd(redphi), redphi) or phi(fadd(redphi), fadd(redphi)) ?
> > > > > > 
> > > > > > And that we can't easily get this info from just the FMF or the ExactFPMathInst?
> > > > > I have the same question. I was looking at this code recently ( 36a489d19475 , 1bee549737ac ) -- but I'm still not sure if we are behaving correctly or optimally. 
> > > > > 
> > > > > The "ExactFPMathInst" seems to provide the same information. I experimented with changing that to a bool. The only reason we appear to save an instruction pointer rather than a bool is so we can provide a prettier optimization remark in the case a loop is not vectorized. Ie, we say something like: "The FP math at line 42 is not associative" rather than "The loop starting at line 39 requires exact FP math".
> > > > I think the IsOrdered flag here is more of a convenience so that we can avoid calling the more expensive `checkOrderedReduction` function every time we may want to use strict, in-order reduction intrinsics. The `checkOrderedReduction` does cast the instruction to a `FPMathOperator` and look for the allows-reassoc flag.
> > > Hi @spatel, oh I see now what you mean. I didn't realise we now had a ExactFPMathInst in RecurrenceDescriptor. It looks like you're saying instead of setting the IsOrdered flag we can just set ExactFPMathInst to the instruction in ReduxDesc, which then gets passed on to the RecurrenceDescriptor.
> > As far as I understand, we still can not vectorize in-order reductions with multiple adds in the loop. Something like https://godbolt.org/z/c9qd1v. The ordering would change if we tried.
> > 
> > So the flag may still be needed for the second bullet point above, which is a large part of the checkOrderedReduction. Either that or a way to compute that there is only a single fadd (possibly through a select/if block phi).
> Hi @dmgreen & @spatel, I also didn't realise that ExactFPMathInst is very similar and already checks the hasAllowReassoc flag. I've left the IsOrdered flag in this patch as I think there is still a need for it in the case mentioned above where there is a chain of fadds (I've added a test for this to `strict-fadd.ll`). I also changed the conditions in checkOrderedReduction slightly, to check if `Exit == ExactFPMathInst`.
Can we clarify the comment, to specify that it means this reduction can be treated like an inorder reduction, and what that currently means.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98435/new/

https://reviews.llvm.org/D98435



More information about the llvm-commits mailing list