[PATCH] D113973: [LoopVectorize][CostModel] Choose smaller VFs for in-loop reductions with no loads/stores

Tue Nov 16 15:01:01 PST 2021

dmgreen added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:6353
         const RecurrenceDescriptor &RdxDesc = Legal->getReductionVars()[PN];
         if (PreferInLoopReductions || useOrderedReductions(RdxDesc) ||
             TTI.preferInLoopReduction(RdxDesc.getOpcode(),
----------------
This deliberately excludes InLoopReductions from the maximum width of the register - because the phi remains scalar and it's useful for integer reductions under MVE where they can sum a vecreduce(v16i32 sext(v8i32)) in a single operation. That might not be as useful for float types - and the example loop you showed with no loads/stores in the loop is much less likely for integer - it will already have been simplified.

Perhaps we just remove this for ordered (float) reductions? Or does that lead to regressions?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113973/new/

https://reviews.llvm.org/D113973