[all-commits] [llvm/llvm-project] a5dd6c: [LoopVectorize] Don't interleave scalar ordered re...

david-arm via All-commits all-commits at lists.llvm.org
Tue Jul 27 09:50:17 PDT 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: a5dd6c6cf9356f7e7c4611a0d5c198ae7cd34106
      https://github.com/llvm/llvm-project/commit/a5dd6c6cf9356f7e7c4611a0d5c198ae7cd34106
  Author: David Sherwood <david.sherwood at arm.com>
  Date:   2021-07-27 (Tue, 27 Jul 2021)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    A llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd-vf1.ll

  Log Message:
  -----------
  [LoopVectorize] Don't interleave scalar ordered reductions for inner loops

Consider the following loop:

  void foo(float *dst, float *src, int N) {
    for (int i = 0; i < N; i++) {
      dst[i] = 0.0;
      for (int j = 0; j < N; j++) {
        dst[i] += src[(i * N) + j];
      }
    }
  }

When we are not building with -Ofast, floating-point adds cannot be
reassociated, so we may attempt to vectorise the inner loop using
ordered (strict, in-order) reductions instead. We also try to select
an appropriate interleave count for the inner loop. However, when VF=1
is chosen the inner loop remains scalar, and existing code in
selectInterleaveCount limits the interleave count to 2 for reductions
due to concerns about lengthening the critical path. For ordered
reductions this problem is even worse due to the additional data
dependency, so I've added code to simply disable interleaving for
scalar ordered reductions for now.
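
To illustrate the dependency issue, here is a minimal C sketch; it is not taken from the patch. The function names (sum_ordered, sum_ordered_ic2, sum_reassoc_ic2) are hypothetical and only model what the different strategies compute for one row of src. The second function shows that interleaving an ordered reduction by 2 still leaves every add on a single dependency chain, and the third shows the two-accumulator form that is only legal when reassociation is allowed (e.g. with -Ofast).

```c
#include <assert.h>

/* Strict in-order reduction: every add depends on the previous one. */
static float sum_ordered(const float *src, int n) {
  assert(n >= 0);
  float acc = 0.0f;
  for (int j = 0; j < n; j++)
    acc += src[j];
  return acc;
}

/* Interleaved by 2 while preserving order: the two adds per iteration
   still form one serial chain through acc, so nothing runs in parallel. */
static float sum_ordered_ic2(const float *src, int n) {
  assert(n >= 0);
  float acc = 0.0f;
  int j = 0;
  for (; j + 1 < n; j += 2) {
    acc += src[j];
    acc += src[j + 1];
  }
  for (; j < n; j++) /* remainder iteration */
    acc += src[j];
  return acc;
}

/* Interleaved by 2 with independent accumulators: this shortens the
   dependency chain but reassociates the adds, so it is not a valid
   ordered reduction and can change the result. */
static float sum_reassoc_ic2(const float *src, int n) {
  assert(n >= 0);
  float acc0 = 0.0f, acc1 = 0.0f;
  int j = 0;
  for (; j + 1 < n; j += 2) {
    acc0 += src[j];
    acc1 += src[j + 1];
  }
  for (; j < n; j++) /* remainder iteration */
    acc0 += src[j];
  return acc0 + acc1;
}
```

Assuming IEEE single-precision evaluation and no fast-math flags, the input {1e8f, 1.0f, -1e8f, 1.0f} gives the same result from both ordered versions, while the reassociated version gives a different one, which is why only the single-chain form is available for an ordered reduction.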

Test added here:

  Transforms/LoopVectorize/AArch64/strict-fadd-vf1.ll

Differential Revision: https://reviews.llvm.org/D106646



