[all-commits] [llvm/llvm-project] b87033: [SLP] Match poison as instruction with the same op...

Fri Nov 22 13:10:39 PST 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: b8703369daf777706196ff914c0376c27adde3cf
      https://github.com/llvm/llvm-project/commit/b8703369daf777706196ff914c0376c27adde3cf
  Author: Alexey Bataev <a.bataev at outlook.com>
  Date:   2024-11-22 (Fri, 22 Nov 2024)

  Changed paths:
    M llvm/include/llvm/IR/Instruction.h
    M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
    M llvm/test/Transforms/SLPVectorizer/AArch64/reused-scalar-repeated-in-node.ll
    M llvm/test/Transforms/SLPVectorizer/AArch64/vectorize-free-extracts-inserts.ll
    M llvm/test/Transforms/SLPVectorizer/X86/landing_pad.ll

  Log Message:
  -----------
  [SLP] Match poison as instruction with the same opcode

Patch allows to vector scalar instruction + poison values as if poisons
are instructions with the same opcode. It allows better vectorization of
the repeated values, reduces number of insertelement instructions and
serves as a base ground for copyable elements vectorization

AVX512, -O3 + LTO

JM/ldecod - better vector code
Applications/oggenc - better vectorization
CINT2017speed/625.x264_s
CINT2017rate/525.x264_r - better vector code
CFP2017rate/526.blender_r - better vector code
CFP2006/447.dealII - small variations
Benchmarks/Bullet - extra vector code
CFP2017rate/510.parest_r - better vectorization
CINT2017rate/502.gcc_r
CINT2017speed/602.gcc_s - extra vector code
Benchmarks/tramp3d-v4 - small variations
CFP2006/453.povray - extra vector code
JM/lencod - better vector code
CFP2017rate/511.povray_r - extra vector code
MemFunctions/MemFunctions - extra vector code
LoopVectorization/LoopVectorizationBenchmarks - extra vector code
XRay/FDRMode - extra vector code
XRay/ReturnReference - extra vector code
LCALS/SubsetCLambdaLoops - extra vector code
LCALS/SubsetCRawLoops - extra vector code
LCALS/SubsetARawLoops - extra vector code
LCALS/SubsetALambdaLoops - extra vector code
DOE-ProxyApps-C++/miniFE - extra vector code
LoopVectorization/LoopInterleavingBenchmarks - extra vector code
LCALS/SubsetBLambdaLoops - extra vector code
MicroBenchmarks/harris - extra vector code
ImageProcessing/Dither - extra vector code
MicroBenchmarks/SLPVectorization - extra vector code
ImageProcessing/Blur - extra vector code
ImageProcessing/Dilate - extra vector code
Builtins/Int128 - extra vector code
ImageProcessing/Interpolation - extra vector code
ImageProcessing/BilateralFiltering - extra vector code
ImageProcessing/AnisotropicDiffusion - extra vector code
MicroBenchmarks/LoopInterchange - extra code vectorized
LCALS/SubsetBRawLoops - extra code vectorized
CINT2006/464.h264ref - extra vectorization with wider vectors
CFP2017rate/508.namd_r - small variations, extra phis vectorized
CFP2006/444.namd - 2 2 x phi replaced by 4 x phi
DOE-ProxyApps-C/SimpleMOC - extra code vectorized
CINT2017rate/541.leela_r
CINT2017speed/641.leela_s - the function better vectorized and inlined
Benchmarks/Misc/oourafft - 2 4 x bit reductions replaced by 2 x vector code
FreeBench/fourinarow - better vectorization

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/115946

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications