[all-commits] [llvm/llvm-project] c7dbe3: [AArch64][LoopVectorize] Enable tail-folding of si...

Thu May 18 03:36:13 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: c7dbe326dff81273eabe339fe69cd7bef947619c
      https://github.com/llvm/llvm-project/commit/c7dbe326dff81273eabe339fe69cd7bef947619c
  Author: David Sherwood <david.sherwood at arm.com>
  Date:   2023-05-18 (Thu, 18 May 2023)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64Subtarget.cpp
    M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect-vscale-tune.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-overflow-checks.ll

  Log Message:
  -----------
  [AArch64][LoopVectorize] Enable tail-folding of simple loops on neoverse-v1

This patch enables the tail-folding of simple loops by default
when targeting the neoverse-v1 CPU. Simple loops exclude those
with recurrences or reductions or loops that are reversed.

New tests have been added here:

Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll

In terms of SPEC2017 only one benchmark is really affected when
building with "-Ofast -mcpu=neoverse-v1 -flto", which is
(+ faster, - slower):

525.x264: +7.0%

Differential Revision: https://reviews.llvm.org/D130618