[PATCH] D113973: [LoopVectorize][CostModel] Choose smaller VFs for in-loop reductions with no loads/stores

Wed Nov 17 00:44:48 PST 2021

sdesmalen added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:6353
         const RecurrenceDescriptor &RdxDesc = Legal->getReductionVars()[PN];
         if (PreferInLoopReductions || useOrderedReductions(RdxDesc) ||
             TTI.preferInLoopReduction(RdxDesc.getOpcode(),
----------------
dmgreen wrote:
> This deliberately excludes InLoopReductions from the maximum width of the register - because the phi remains scalar and it's useful for integer reductions under MVE where they can sum a vecreduce(v16i32 sext(v8i32)) in a single operation. That might not be as useful for float types - and the example loop you showed with no loads/stores in the loop is much less likely for integer - it will already have been simplified.
> 
> Perhaps we just remove this for ordered (float) reductions? Or does that lead to regressions?
> This deliberately excludes InLoopReductions from the maximum width of the register - because the phi remains scalar and it's useful for integer reductions under MVE where they can sum a vecreduce(v16i32 sext(v8i32)) in a single operation.
Yes, the same is true for SVE. There is also code in the cost-model to recognise the extend to the wider type.

I think the actual reason for the reduction-code being here is to avoid ending up with vector PHIs that are too wide (out-of-loop reduction). The checks for in-loop reductions were added later because (1) there is no vector PHI and (2) that it doesn't limit the VF too early so that it lets the cost-model code consider the wider VFs for the reason you described.

> Perhaps we just remove this for ordered (float) reductions? Or does that lead to regressions?
I don't think we should add specific knowledge to limit the VF for fp-reductions here, because that means adding target-specific knowledge to the `collectElementTypesForWidening`, which is something that the cost-model should decide on. Also it would lead to regressions.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113973/new/

https://reviews.llvm.org/D113973