[all-commits] [llvm/llvm-project] e473da: [AArch64] Improve cost computations for odd vector...

Wed Jan 17 13:32:18 PST 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: e473daa7797db6e0f45ef9e12081ccce7d2ed26f
      https://github.com/llvm/llvm-project/commit/e473daa7797db6e0f45ef9e12081ccce7d2ed26f
  Author: Florian Hahn <flo at fhahn.com>
  Date:   2024-01-17 (Wed, 17 Jan 2024)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/AArch64/vec3-ops.ll

  Log Message:
  -----------
  [AArch64] Improve cost computations for odd vector mem ops. (#78181)

Improve cost computaton for odd vector mem ops by breaking them down
into smaller power-of-2 parts and sum up the cost of those parts.

This fixes the current cost estimates, which for most parts
underestimated the cos, due to using getTypeLegalizationCost, which
widens to the next power-of-2 in a single step in most cases. This
doesn't reflect the actual cost.

See https://llvm.godbolt.org/z/vMsnxMf1v for codegen for the tests.

Note that there is a special case for v3i8, for which current codegen is
pretty bad, due to automatic widening to v4i8, which in turn requires
the conversion to go through memory ops in the stack. I am planning on
fixing that as a follow-up, but I am not yet sure where to best fix
this.

At the moment, there are almost no cases in which such vector operations
will be generated automatically. The motivating case is non-power-of-2
SLP vectorization: https://github.com/llvm/llvm-project/pull/77790

PR: https://github.com/llvm/llvm-project/pull/78181