[all-commits] [llvm/llvm-project] 8f337f: [VectorCombine] generalize pass param name for ear...

Mon Nov 21 10:58:12 PST 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 8f337f8ffe36fe85b2015f4765c4f0f9a59d46d1
      https://github.com/llvm/llvm-project/commit/8f337f8ffe36fe85b2015f4765c4f0f9a59d46d1
  Author: Sanjay Patel <spatel at rotateright.com>
  Date:   2022-11-21 (Mon, 21 Nov 2022)

  Changed paths:
    M llvm/include/llvm/Transforms/Vectorize/VectorCombine.h
    M llvm/lib/Passes/PassBuilderPipelines.cpp
    M llvm/lib/Transforms/Vectorize/VectorCombine.cpp

  Log Message:
  -----------
  [VectorCombine] generalize pass param name for early combines; NFC

The option was added with https://reviews.llvm.org/D102496,
and currently the name is accurate, but I am hoping to add
a load transform that is not a scalarization. See issue #17113.

  Commit: 163bb6d64e5f1220777c3ec2a8b58c0666a74d91
      https://github.com/llvm/llvm-project/commit/163bb6d64e5f1220777c3ec2a8b58c0666a74d91
  Author: Sanjay Patel <spatel at rotateright.com>
  Date:   2022-11-21 (Mon, 21 Nov 2022)

  Changed paths:
    M llvm/lib/Passes/PassBuilderPipelines.cpp
    M llvm/lib/Transforms/Vectorize/VectorCombine.cpp
    M llvm/test/Other/new-pm-defaults.ll
    M llvm/test/Other/new-pm-thinlto-defaults.ll
    M llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
    M llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
    M llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
    M llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll
    M llvm/test/Transforms/PhaseOrdering/X86/vec-load-combine.ll

  Log Message:
  -----------
  [Passes][VectorCombine] enable early run generally and try load folds

An early run of VectorCombine was added with D102496 specifically to
deal with unnecessary vector ops produced with the C matrix extension.
This patch proposes trying those folds in general and adds a pair
of load folds to the menu.

The load transform will partly solve (see PhaseOrdering diffs) a
longstanding vectorization perf bug by removing redundant loads via GVN:
issue #17113

The main reason for not enabling the extra pass generally in the initial
patch was compile-time cost. The cost of VectorCombine was significantly
(surprisingly) improved with:
87debdadaf18
https://llvm-compile-time-tracker.com/compare.php?from=ffe05b8f57d97bc4340f791cb386c8d00e0739f2&to=87debdadaf18f8a5c7e5d563889e10731dc3554d&stat=instructions:u

...so the extra run now costs very little: the total cost of
the 2 runs should be less than the cost of the 1 run before that micro-optimization:
https://llvm-compile-time-tracker.com/compare.php?from=5e8c2026d10e8e2c93c038c776853bed0e7c8fc1&to=2c4b68eab5ae969811f422714e0eba44c5f7eefb&stat=instructions:u

It may be possible to reduce the cost slightly more with a few more
early exits like that, but the gain is probably in the noise based on timing
experiments.

Differential Revision: https://reviews.llvm.org/D138353

Compare: https://github.com/llvm/llvm-project/compare/ed7870c2e5c4...163bb6d64e5f
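
The two commits above describe a pass that can be constructed in an
"early" mode that tries only a subset of its folds, and that is then run
twice in the pipeline. The following is a minimal standalone sketch of that
idea, not the LLVM implementation: the class ToyVectorCombine, the struct
Fold, the flag name TryEarlyFoldsOnly, and the fold names are all
hypothetical and chosen only for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one fold; not an LLVM data structure.
struct Fold {
  std::string Name;
  bool OnEarlyMenu; // true if the fold is also tried in the early run
};

// Toy model of a pass that takes a constructor flag restricting it to
// the "early" subset of folds. The flag name is an assumption.
class ToyVectorCombine {
  bool TryEarlyFoldsOnly;

public:
  explicit ToyVectorCombine(bool TryEarlyFoldsOnly = false)
      : TryEarlyFoldsOnly(TryEarlyFoldsOnly) {}

  // Returns the names of the folds this run would attempt.
  std::vector<std::string> run(const std::vector<Fold> &Menu) const {
    std::vector<std::string> Tried;
    for (const Fold &F : Menu)
      if (!TryEarlyFoldsOnly || F.OnEarlyMenu)
        Tried.push_back(F.Name);
    return Tried;
  }
};

// Illustrative menu: two folds are on the early menu, one is late-only.
inline std::vector<Fold> exampleMenu() {
  return {{"scalarize-load", true},
          {"load-fold", true},
          {"late-shuffle-fold", false}};
}
```

In this sketch, ToyVectorCombine(true).run(exampleMenu()) attempts only the
two early folds, while a default-constructed instance attempts all three.
Generalizing the flag name, as the first commit does, lets non-scalarization
folds join the early menu without the name becoming misleading.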