[all-commits] [llvm/llvm-project] 961f51: [LoopVectorize][CostModel] Choose smaller VFs for ...

Tue Jan 4 02:26:14 PST 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 961f51fdf04fd14f5dc5e7a6d53a5460249d947c
      https://github.com/llvm/llvm-project/commit/961f51fdf04fd14f5dc5e7a6d53a5460249d947c
  Author: Rosie Sumpter <rosie.sumpter at arm.com>
  Date:   2022-01-04 (Tue, 04 Jan 2022)

  Changed paths:
    M llvm/include/llvm/Analysis/IVDescriptors.h
    M llvm/lib/Analysis/IVDescriptors.cpp
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    M llvm/test/Transforms/LoopVectorize/AArch64/smallest-and-widest-types.ll
    M llvm/test/Transforms/LoopVectorize/X86/funclet.ll

  Log Message:
  -----------
  [LoopVectorize][CostModel] Choose smaller VFs for in-loop reductions without loads/stores

For loops that contain in-loop reductions but no loads or stores, large
VFs are chosen because LoopVectorizationCostModel::getSmallestAndWidestTypes
has no element types to inspect and so returns its default widths
(-1U for the smallest and 8 bits for the widest). This results in the
widest VF being chosen for the following example,

float s = 0;
for (int i = 0; i < N; ++i)
  s += (float) i*i;

which, for more computationally intensive loops, leads to large loop
sizes when the operations end up being scalarized.
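
To see why a widest type of 8 bits yields a large VF, here is a rough
sketch of the arithmetic. It assumes the maximum VF is approximately the
vector register width divided by the widest type width, rounded down to a
power of two. The 128-bit register width and the helper names
powerOf2Floor and maxVF are illustrative assumptions, not LLVM APIs.

```cpp
// Hypothetical helper (not an LLVM API): largest power of two <= x.
// Returns 1 when x is 0 or 1.
unsigned powerOf2Floor(unsigned x) {
  unsigned p = 1;
  while (p <= x / 2)
    p *= 2;
  return p;
}

// Simplified model (an assumption, not the actual cost model): the
// maximum VF is the register width divided by the widest type width.
// widestTypeBits is assumed to be non-zero.
unsigned maxVF(unsigned registerBits, unsigned widestTypeBits) {
  return powerOf2Floor(registerBits / widestTypeBits);
}
```

Under these assumptions, maxVF(128, 8) is 16, whereas a 32-bit widest
type (such as float in the example above) gives maxVF(128, 32) = 4.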

In this patch, for the case where ElementTypesInLoop is empty, the widest
type is determined by finding the smallest type used by recurrences in
the loop instead of falling back to a default value of 8 bits. This
results in the cost model choosing a more sensible VF for loops like
the one above.
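
The selection logic described above can be sketched as a standalone
function. This is a simplified model and not the code from
LoopVectorize.cpp: the function name, its parameters, and the use of
plain bit widths in place of LLVM types are all assumptions made for
illustration.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Simplified model of the behaviour described in the log message.
// elementTypeBits: widths of the load/store element types in the loop
//                  (stands in for ElementTypesInLoop).
// recurrenceTypeBits: widths of the types used by the loop's recurrences.
// Both parameters are hypothetical; the real code works on LLVM types.
std::pair<unsigned, unsigned>
smallestAndWidest(const std::vector<unsigned> &elementTypeBits,
                  const std::vector<unsigned> &recurrenceTypeBits) {
  unsigned minWidth = -1U; // default for the smallest type
  unsigned maxWidth = 8;   // default for the widest type, in bits

  if (elementTypeBits.empty()) {
    // No loads or stores: take the smallest type used by the recurrences
    // as the widest type instead of keeping the 8-bit default.
    // In this sketch maxWidth stays at -1U if there are no recurrences.
    maxWidth = -1U;
    for (unsigned bits : recurrenceTypeBits)
      maxWidth = std::min(maxWidth, bits);
  } else {
    // Usual case: scan the element types of the loads and stores.
    for (unsigned bits : elementTypeBits) {
      minWidth = std::min(minWidth, bits);
      maxWidth = std::max(maxWidth, bits);
    }
  }
  return {minWidth, maxWidth};
}
```

For the float reduction in the example, with no element types and a
single 32-bit recurrence type, this model reports a widest type of 32
bits rather than 8.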

Differential Revision: https://reviews.llvm.org/D113973