[PATCH] D98435: [LoopVectorize] Add strict in-order reduction support for fixed-width vectorization

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 16 09:26:05 PDT 2021


dmgreen added inline comments.


================
Comment at: llvm/include/llvm/Analysis/IVDescriptors.h:257
   bool IsSigned = false;
+  // If this is an ordered reduction
+  bool IsOrdered = false;
----------------
david-arm wrote:
> david-arm wrote:
> > spatel wrote:
> > > dmgreen wrote:
> > > > Am I correct in saying that Ordered reductions are a subset of floating point reductions that
> > > > - don't have AllowReassoc on the fadd
> > > > - have only a single fadd added in each loop iteration (but possibly predicated to be one of several additions). So either fadd(redphi(..)) or phi(fadd(redphi), redphi) or phi(fadd(redphi), fadd(redphi)) ?
> > > > 
> > > > And that we can't easily get this info from just the FMF or the ExactFPMathInst?
> > > I have the same question. I was looking at this code recently ( 36a489d19475 , 1bee549737ac ) -- but I'm still not sure if we are behaving correctly or optimally. 
> > > 
> > > The "ExactFPMathInst" seems to provide the same information. I experimented with changing that to a bool. The only reason we appear to save an instruction pointer rather than a bool is so we can provide a prettier optimization remark in the case a loop is not vectorized. Ie, we say something like: "The FP math at line 42 is not associative" rather than "The loop starting at line 39 requires exact FP math".
> > I think the IsOrdered flag here is more of a convenience so that we can avoid calling the more expensive `checkOrderedReduction` function every time we may want to use strict, in-order reduction intrinsics. The `checkOrderedReduction` does cast the instruction to a `FPMathOperator` and look for the allows-reassoc flag.
> Hi @spatel, oh I see now what you mean. I didn't realise we now had a ExactFPMathInst in RecurrenceDescriptor. It looks like you're saying instead of setting the IsOrdered flag we can just set ExactFPMathInst to the instruction in ReduxDesc, which then gets passed on to the RecurrenceDescriptor.
As far as I understand, we still can not vectorize in-order reductions with multiple adds in the loop. Something like https://godbolt.org/z/c9qd1v. The ordering would change if we tried.

So the flag may still be needed for the second bullet point above, which is a large part of the checkOrderedReduction. Either that or a way to compute that there is only a single fadd (possibly through a select/if block phi).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98435/new/

https://reviews.llvm.org/D98435



More information about the llvm-commits mailing list