[all-commits] [llvm/llvm-project] 962fba: [LoopVectorize] Refine runtime memory check costs ...

Fri Jan 26 06:44:00 PST 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 962fbafecf4730ba84a3b9fd7a662a5c30bb2c7c
      https://github.com/llvm/llvm-project/commit/962fbafecf4730ba84a3b9fd7a662a5c30bb2c7c
  Author: David Sherwood <57997763+david-arm at users.noreply.github.com>
  Date:   2024-01-26 (Fri, 26 Jan 2024)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    A llvm/test/Transforms/LoopVectorize/AArch64/low_trip_memcheck_cost.ll

  Log Message:
  -----------
  [LoopVectorize] Refine runtime memory check costs when there is an outer loop (#76034)

When we generate runtime memory checks for an inner loop, the
checks may be invariant in the outer loop and so get hoisted out
of it. In such cases, the effective cost of the checks should be
reduced to reflect the outer loop trip count, since the checks
then run once rather than once per outer loop iteration.
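
As a rough illustration of the cost adjustment described above, the
sketch below amortises the check cost over the outer loop trip count.
It is not the code from LoopVectorize.cpp: the helper name
effectiveCheckCost, its parameters, the convention that a trip count
of 0 means "unknown", and the clamp to a minimum cost of 1 are all
assumptions made for this example.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper for illustration only; not an LLVM API.
//
// checkCost:      cost of the runtime memory checks for one entry to
//                 the inner loop.
// outerInvariant: true if the check condition does not change between
//                 outer loop iterations (so it can be hoisted).
// outerTripCount: known or estimated outer loop trip count; 0 is used
//                 here to mean "unknown" (an assumption of this sketch).
std::uint64_t effectiveCheckCost(std::uint64_t checkCost,
                                 bool outerInvariant,
                                 std::uint64_t outerTripCount) {
  // Checks that vary in the outer loop run on every outer iteration,
  // so nothing is amortised. With no trip count information this
  // sketch conservatively keeps the full cost as well.
  if (!outerInvariant || outerTripCount == 0)
    return checkCost;

  // A free check stays free.
  if (checkCost == 0)
    return 0;

  // Hoisted checks run once per outer loop, so spread their cost over
  // the outer iterations, but never report a non-zero cost as zero.
  return std::max<std::uint64_t>(1, checkCost / outerTripCount);
}
```

For example, with these assumptions a check cost of 40 and an outer
trip count of 8 gives an effective cost of 5 per inner loop entry,
while the same checks in a non-invariant position keep a cost of 40.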

This fixes a 25% performance regression introduced by commit

49b0e6dcc296792b577ae8f0f674e61a0929b99d

when building the SPEC2017 x264 benchmark with PGO, where we
decided the inner loop trip count wasn't high enough to justify
the (overestimated) cost of the runtime checks. Also, when
runtime memory checks consist entirely of diff checks, they are
likely to be outer loop invariant.
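
One plausible reading of the last point, sketched below: a diff check
compares the distance between two start addresses against the number
of bytes touched per vector iteration, so if both addresses advance by
the same amount on each outer iteration the distance, and hence the
check, does not change. The helper name diffCheckConflicts and its
parameters are hypothetical, and the exact form of the check the
vectorizer emits may differ.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only; not an LLVM API.
//
// Returns true if the vector loop might be unsafe, i.e. the sink
// starts fewer than vf * interleave * accessSize bytes after the
// source. The subtraction is unsigned, so a sink that starts before
// the source wraps to a large value and is treated as non-conflicting.
bool diffCheckConflicts(std::uint64_t srcStart, std::uint64_t sinkStart,
                        std::uint64_t vf, std::uint64_t interleave,
                        std::uint64_t accessSize) {
  std::uint64_t diff = sinkStart - srcStart; // wraps modulo 2^64
  return diff < vf * interleave * accessSize;
}
```

If the source and sink rows for outer iteration i start at
src0 + i * stride and sink0 + i * stride, then sinkStart - srcStart
equals sink0 - src0 for every i, so the result of the check is the
same on every outer iteration and it can be evaluated once outside
the outer loop.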