[all-commits] [llvm/llvm-project] 1c4fed: [LoopVectorize] Don't tail-fold for scalable VFs w...

Mon Mar 27 01:34:49 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 1c4fedfa35aeb8b456e2d8f4f826c0e026b9d863
      https://github.com/llvm/llvm-project/commit/1c4fedfa35aeb8b456e2d8f4f826c0e026b9d863
  Author: David Sherwood <david.sherwood at arm.com>
  Date:   2023-03-27 (Mon, 27 Mar 2023)

  Changed paths:
    M llvm/include/llvm/Analysis/TargetTransformInfo.h
    M llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
    M llvm/include/llvm/CodeGen/BasicTTIImpl.h
    M llvm/lib/Analysis/TargetTransformInfo.cpp
    M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
    M llvm/lib/Target/AArch64/AArch64ISelLowering.h
    M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
    M llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    M llvm/test/Transforms/LoopVectorize/AArch64/eliminate-tail-predication.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding.ll

  Log Message:
  -----------
  [LoopVectorize] Don't tail-fold for scalable VFs when there is no scalar tail

Currently in LoopVectorize we avoid tail-folding if we can
prove the trip count is always a multiple of the maximum
fixed-width VF. This works because we know the vectoriser
only ever chooses a VF that is a power of 2. However, if
we are also considering scalable VFs then we conservatively
bail out of the optimisation because we don't know the value
of vscale, which could be an odd or prime number, etc.

This patch tries to enable the same optimisation for scalable
VFs by asking if vscale is known to be a power of 2. If so,
we can then query the maximum value of vscale and use the same
logic as we do for fixed-width VFs. I've also added a new TTI
hook called isVScaleKnownToBeAPowerOfTwo that does the same
thing as the existing TargetLowering hook.

Differential Revision: https://reviews.llvm.org/D146199