[all-commits] [llvm/llvm-project] 96b2e3: [CostModel][AArch64] Improve fixed-width vector co...

Wed Apr 24 06:31:29 PDT 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 96b2e35a58819eb2fbe1821650e35a1f0e085bd7
      https://github.com/llvm/llvm-project/commit/96b2e35a58819eb2fbe1821650e35a1f0e085bd7
  Author: David Sherwood <57997763+david-arm at users.noreply.github.com>
  Date:   2024-04-24 (Wed, 24 Apr 2024)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll

  Log Message:
  -----------
  [CostModel][AArch64] Improve fixed-width vector costs for get.active.lane.mask (#89068)

When SVE is available we can lower calls to get.active.lane.mask using
the SVE whilelo instruction. In practice, however, vXi1 types are not
legal for NEON, so we often end up expanding the predicate into a vector
of integers, e.g. v4i1 -> v4i32. This usually happens when we have to
keep the predicate live out of the block, for example when the predicate
is the incoming value to a PHI node in a tail-folded vector loop.
Currently in such cases the intrinsic call has a cost of 1, which is far
too low when considering the extra instructions required to expand the
predicate. This patch fixes that by basing the cost on the number of
lane moves required for expansion. This is required for a follow-on
patch that adds the cost of the intrinsic call to the vectorisation cost
model, so that we can teach the vectoriser to make better choices.
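
As a rough illustration of the reasoning above, the following standalone
C++ sketch models the lane mask, its expansion to a vector of integers,
and a cost that grows with the lane count. It does not use any LLVM API
and is not the code from this patch: the names activeLaneMask, expandMask
and expandedMaskCost, the all-ones encoding of active lanes, and the
costPerLane parameter are all assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reference behaviour of the lane mask: lane i is active iff
// (base + i) < n, compared as unsigned values. Overflow of base + i
// is not modelled in this sketch.
template <std::size_t N>
std::array<bool, N> activeLaneMask(std::uint64_t base, std::uint64_t n) {
  std::array<bool, N> mask{};
  for (std::size_t i = 0; i < N; ++i)
    mask[i] = (base + i) < n;
  return mask;
}

// Hypothetical expansion of an i1 mask to 32-bit lanes (v4i1 -> v4i32
// when N == 4). Encoding active lanes as all-ones is an assumption.
template <std::size_t N>
std::array<std::uint32_t, N> expandMask(const std::array<bool, N> &mask) {
  std::array<std::uint32_t, N> wide{};
  for (std::size_t i = 0; i < N; ++i)
    wide[i] = mask[i] ? 0xFFFFFFFFu : 0u; // one lane move per element
  return wide;
}

// Hypothetical cost function: proportional to the number of lanes that
// must be moved. costPerLane is a placeholder, not a value from the patch.
unsigned expandedMaskCost(unsigned numLanes, unsigned costPerLane) {
  assert(numLanes > 0 && "expected a non-empty fixed-width vector");
  return numLanes * costPerLane;
}
```

For instance, activeLaneMask<4>(4, 6) marks only the first two lanes as
active, and expandedMaskCost(4, 1) scales with the four lanes rather than
staying at a flat cost of 1.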
