[PATCH] D155246: [SLP]Improve stores vectorization.
Alexey Bataev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 13 16:08:59 PDT 2023
ABataev created this revision.
ABataev added reviewers: RKSimon, vdmitrie.
Herald added subscribers: vporpo, hiraditya.
Herald added a project: All.
ABataev requested review of this revision.
Herald added a subscriber: wangpc.
Herald added a project: LLVM.
Use an O(N log N) sorting approach instead of the O(N^2) one (N <= 32), and
do not try to revectorize every possible combination of stores when they
definitely cannot be combined because of memory/data dependencies.
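To illustrate the general idea of sort-based grouping (this is a minimal sketch, not the code from the patch), the example below sorts stores by their byte offset from a common base and then walks the sorted list once to collect runs of consecutive addresses. `StoreRef`, `groupConsecutiveStores`, and `elemSize` are hypothetical names introduced only for this example; the real pass works on LLVM store instructions and also has to check dependencies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a store instruction: an id plus its byte offset
// from a common base pointer. Not a type from SLPVectorizer.cpp.
struct StoreRef {
  int id;
  int64_t offset;
};

// Sort the stores by offset (O(N log N)) and split them into runs in which
// each store sits exactly elemSize bytes after the previous one. This avoids
// comparing every store against every other store (O(N^2)).
std::vector<std::vector<int>>
groupConsecutiveStores(std::vector<StoreRef> stores, int64_t elemSize) {
  assert(elemSize > 0 && "element size must be positive");
  std::stable_sort(stores.begin(), stores.end(),
                   [](const StoreRef &a, const StoreRef &b) {
                     return a.offset < b.offset;
                   });

  std::vector<std::vector<int>> runs;
  for (std::size_t i = 0; i < stores.size(); ++i) {
    // A store extends the current run only if it directly follows the
    // previous store in memory; otherwise it starts a new run.
    bool extends =
        i > 0 && stores[i].offset - stores[i - 1].offset == elemSize;
    if (!extends)
      runs.emplace_back();
    runs.back().push_back(stores[i].id);
  }
  return runs;
}
```

In this sketch a run containing a single store has no neighbor to combine with, so a caller could skip it rather than retrying it in other combinations.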
Compile time (O3 + lto, skylake_avx512):
External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test 117.15 120.11 2.5%
External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test 203.67 207.42 1.8%
External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 232.43 235.01 1.1%
External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test 205.49 207.25 0.9%
External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test 310.46 306.23 -1.4%
Link time (O3 + lto, skylake_avx512):
External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 1383.69 1475.94 6.7%
The other changes are too small to be reliable.
Code size (size..text):
Program results results0 diff
test-suite :: SingleSource/Regression/C/Regression-C-sumarray.test 392.00 1439.00 267.1%
test-suite :: MultiSource/Applications/JM/ldecod/ldecod.test 394258.00 394818.00 0.1%
test-suite :: MultiSource/Applications/JM/lencod/lencod.test 846355.00 847075.00 0.1%
test-suite :: External/SPEC/CINT2006/464.h264ref/464.h264ref.test 782816.00 783360.00 0.1%
test-suite :: External/SPEC/CFP2017rate/508.namd_r/508.namd_r.test 779667.00 779923.00 0.0%
test-suite :: MultiSource/Benchmarks/mafft/pairlocalalign.test 224398.00 224446.00 0.0%
test-suite :: MultiSource/Applications/oggenc/oggenc.test 185019.00 185035.00 0.0%
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 12487610.00 12488010.00 0.0%
test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test 1051772.00 1051804.00 0.0%
test-suite :: MultiSource/Applications/SPASS/SPASS.test 529586.00 529602.00 0.0%
test-suite :: External/SPEC/CINT2006/400.perlbench/400.perlbench.test 1084684.00 1084716.00 0.0%
test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test 1014245.00 1014261.00 0.0%
test-suite :: MultiSource/Benchmarks/MallocBench/espresso/espresso.test 223494.00 223478.00 -0.0%
test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test 660843.00 660795.00 -0.0%
test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test 660843.00 660795.00 -0.0%
test-suite :: MultiSource/Applications/ClamAV/clamscan.test 568824.00 568760.00 -0.0%
espresso - 2 more stores vectorized
x264 - a small number of changes in 3-4 functions; slightly more vector
stores are generated (two 4x zeroinitializer stores + some other small variations).
clamscan - emits a 32xi8 store instead of several scalar stores + several 4x-8x stores.
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D155246
Files:
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
llvm/test/Transforms/SLPVectorizer/X86/many_stores.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D155246.540209.patch
Type: text/x-patch
Size: 13152 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230713/42951a37/attachment.bin>