[all-commits] [llvm/llvm-project] c5329c: [LV][AArch64] Prefer Fixed over Scalable if cost-m...

Wed Jul 17 02:46:56 PDT 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: c5329c827ab345c4390f9a176573816e7f5c19e3
      https://github.com/llvm/llvm-project/commit/c5329c827ab345c4390f9a176573816e7f5c19e3
  Author: Sjoerd Meijer <smeijer at nvidia.com>
  Date:   2024-07-17 (Wed, 17 Jul 2024)

  Changed paths:
    M llvm/include/llvm/Analysis/TargetTransformInfo.h
    M llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
    M llvm/lib/Analysis/TargetTransformInfo.cpp
    M llvm/lib/Target/AArch64/AArch64Features.td
    M llvm/lib/Target/AArch64/AArch64Processors.td
    M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    A llvm/test/Transforms/LoopVectorize/AArch64/prefer-fixed-if-equal-to-scalable.ll

  Log Message:
  -----------
  [LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neoverse V2) (#95819)

    For the Neoverse V2 we would like to prefer fixed width over scalable
    vectorisation if the cost-model assigns an equal cost to both for certain
    loops. This improves 7 kernels from TSVC-2 and several production kernels by
    about 2x, and does not affect SPEC21017 INT and FP. This also adds a new TTI
    hook that can steer the loop vectorizater to preferring fixed width
    vectorization, which can be set per CPU. For now, this is only enabled for the
    Neoverse V2.

    There are 3 reasons why preferring NEON might be better in the case the
    cost-model is a tie and the SVE vector size is the same as NEON (128-bit):
    architectural reasons, micro-architecture reasons, and SVE codegen reasons. The
    latter will be improved over time, so the more important reasons are the former
    two. I.e., (micro) architecture reason is the use of LPD/STP instructions which
    are not available in SVE2 and it avoids predication.

    For what it is worth: this codegen strategy to generate more NEON is inline
    with GCC's codegen strategy, which is actually even more aggressive in
    generating NEON when no predication is required. We could be smarter about the
    decision making, but this seems to be a first good step in the right direction,
    and we can always revise this later (for example make the target hook more
    general).


To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications