[PATCH] D105432: [Analysis] Add simple cost model for strict (in-order) reductions

Sander de Smalen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 19 00:28:35 PDT 2021


sdesmalen added inline comments.


================
Comment at: llvm/include/llvm/Analysis/TargetTransformInfo.h:1152
+  static bool isOrderedReduction(unsigned Opcode, FastMathFlags FPFlags) {
+    if ((Opcode == Instruction::FAdd || Opcode == Instruction::FMul) &&
+        !FPFlags.allowReassoc())
----------------
Why does this need to check the opcode?


================
Comment at: llvm/include/llvm/Analysis/TargetTransformInfo.h:1184
   InstructionCost getArithmeticReductionCost(
-      unsigned Opcode, VectorType *Ty,
+      unsigned Opcode, VectorType *Ty, FastMathFlags FPFlags,
       TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput) const;
----------------
nit: Not sure if it's worth it, but should you make FPFlags a default argument? That may simplify some of the calls where the flags are unnecessary.


================
Comment at: llvm/include/llvm/Analysis/TargetTransformInfo.h:1170
 
+  enum class ReductionType : uint8_t { Split, Ordered };
+
----------------
david-arm wrote:
> dmgreen wrote:
> > david-arm wrote:
> > > dmgreen wrote:
> > > > sdesmalen wrote:
> > > > > david-arm wrote:
> > > > > > david-arm wrote:
> > > > > > > dmgreen wrote:
> > > > > > > > david-arm wrote:
> > > > > > > > > dmgreen wrote:
> > > > > > > > > > sdesmalen wrote:
> > > > > > > > > > > RKSimon wrote:
> > > > > > > > > > > > Please can you add doxygen description comments for these?
> > > > > > > > > > > nit: may I request this (and any other references to it) to be renamed to TreeWise? In my experience that's a more common term used with these kinds of reductions.
> > > > > > > > > > Could we represent this by passing FastMathFlags, not a new enum? The fast math flags are what dictates in-order vs relaxed for example. And I don't believe that an integer "Ordered" reduction is ever different from a un-ordered reduction.
> > > > > > > > > I guess we could do that, but the flags contain a lot more than just the reassoc flag, which is the only thing we care about here. I personally think it seems a bit more obvious to the caller what's going on if we specify the type directly. Not sure what others think?
> > > > > > > > That would be my point really, that it contains more info. Why invent a new enum when the fast math flags already represent all the information we need here. It is the fundamental information we require here, so why not use it directly?
> > > > > > > > 
> > > > > > > > We are trying to get the cost of a reduction intrinsic, and that intrinsic has some FMFlags. Those flags may dictate the cost of the resulting code, so the FMF's should be passed to the cost function.
> > > > > > > > 
> > > > > > > > fmin/fmax could be in the same situations. It could be possible (although I'm not sure it occurs anywhere at the moment) that a "no-nan" fmax reduction has a different cost to one without the flag.
> > > > > > > Would there ever be a case where we didn't have the fast math flags, but wanted to calculate a theoretical reduction cost anyway? Then you'd have to explicitly create a FastMathFlags object and have to know exactly which flag needs setting in order to get the desired cost? For example, the user would have to know to set the reassoc flag in order to get the tree-based reduction cost. I mean, not saying that's a bad thing necessarily, but it felt a bit less intuitive and would certainly require additional comments above the `getArithmeticReductionCost` function saying how the flags map precisely to the cost.
> > > > > > > 
> > > > > > > Would there potentially be value in having an overloaded interface that maps one to the other (FastMathFlags<>Enum), or do you strongly prefer a single interface that takes FastMathFlags?
> > > > > > > 
> > > > > > > I wonder if other reviewers have a preference here?
> > > > > > I forgot to mention - just for reference I think the fmin/fmax cases go through a different cost function - getMinMaxReductionCost
> > > > > Now that `PairWise` has been removed, I'd expect we'd only ever need to ask for the cost of either an ordered reduction or an unordered reduction ("fastest/most parallel way possible"). Would the fast-math flags ever need to specify anything else than those two options?
> > > > > 
> > > > > Integer operations have no FastMath flags, it seems awkward to have to construct them for an operation that doesn't support it, so I'd say the options are either to pass in a `bool IsOrdered`, or an `Optional<FastMathFlags>` which defaults to `None` (<=> implies 'unordered').
> > > > OK. Adding ReductionType feels like re-adding IsPairwise to me, which we just went and removed. The combination of Integer reduction + "Ordered" should still be a "TreeWise" reduction, right? There is no such things as an ordered integer reduction from codegen perspective. The code at the beginning of getArithmeticReductionCost should really be `if ((Opcode==FAdd || Opcode==FMul) && !FMF.allowReassoc()) ...`.
> > > > 
> > > > You can get empty FastMathFlags with just `FastMathFlags()` (and even give them a default value if needed)
> > > > I agree that min/max go through a different function - I think getMinMaxReductionCost should be updated with FMF too :)  According to the ExpandReduction pass they require nnan for performing a shuffle reduction on fmin/fmax. But that cost could depend on what instructions the target has available.
> > > > 
> > > > An enum/bool feels like the wrong interface to me, from an llvm perspective. But I don't hold that opinion strongly enough to disagree if folks all think that is a better way to go.
> > > I'm trying to see what happens when we always pass in FastMathFlags and it seems there are occasions when we have to construct the flags manually for FP instructions. See llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:getReductionCost:
> > > 
> > > ```    FastMathFlags FMF;
> > >     switch (RdxKind) {
> > >     case RecurKind::FAdd:
> > >     case RecurKind::FMul:
> > >       FMF.setAllowReassoc();
> > >       LLVM_FALLTHROUGH;
> > >     case RecurKind::Add:
> > >     case RecurKind::Mul:
> > >     case RecurKind::Or:
> > >     case RecurKind::And:
> > >     case RecurKind::Xor: {
> > >       unsigned RdxOpcode = RecurrenceDescriptor::getOpcode(RdxKind);
> > >       VectorCost = TTI->getArithmeticReductionCost(
> > >           RdxOpcode, VectorTy, FMF);
> > >       ScalarCost = TTI->getArithmeticInstrCost(RdxOpcode, ScalarTy);
> > >       break;
> > >     }```
> > > 
> > > This does work, but in this particular case it doesn't look as nice as before so I'm tempted to go with an Optional as @sdesmalen suggested.
> > Can you explain how Optional makes things better?
> > 
> > It looks like the SLP vectorizer has already computed `RdxFMF`, which are set in the builder. Otherwise they should in general come from the instruction the reduction is created from, which in this case would be from `FirstReducedVal`. It looks like the RdxFMF are exactly what is needed though, could it use those directly?
> So i just meant that with @sdesmalen 's suggestion of using `None` to imply unordered we could simply do:
> 
> ```switch (RdxKind) {
> case RecurKind::Add:
> case RecurKind::Mul:
> case RecurKind::Or:
> case RecurKind::And:
> case RecurKind::Xor: 
> case RecurKind::FAdd:
> case RecurKind::FMul: {
>   unsigned RdxOpcode = RecurrenceDescriptor::getOpcode(RdxKind);
>   VectorCost = TTI->getArithmeticReductionCost(
>       RdxOpcode, VectorTy, None);
>   ScalarCost = TTI->getArithmeticInstrCost(RdxOpcode, ScalarTy);
>   break;
> }```
> 
> Ideally I'd like to avoid constructing the flags here so if there is a way to pull out existing flags that's better. I'll have a look.
> Can you explain how Optional makes things better?
It's mostly conceptual that fast-math flags have no meaning for integer types, so it makes no sense to construct and pass some nonsensical FMF value for it.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105432/new/

https://reviews.llvm.org/D105432



More information about the llvm-commits mailing list