[PATCH] D127604: [SLP][X86] Add 32-bit vector stores to help vectorization opportunities

Dinar Temirbulatov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 13 09:08:22 PDT 2022


dtemirbulatov added a comment.

> I haven't had time to properly test these (my last testing was with D103925 <https://reviews.llvm.org/D103925> which attempted something similar), so if anyone has a working test suite instance to hand that'd be very useful!

I have one test-suite instance on an i7-6700HQ with the frequency fixed at 2.6GHz. The following LNT flags were used: "--threads 1 --use-perf=all --cflags '-O3 -mavx2'"
The columns are: 1) Test name 2) Hash of the binary before 3) Hash of the binary after 4) Run time before 5) Run time after 6) Cycles before 7) Cycles after

MultiSource/Applications/ALAC/decode/alacconvert-decode 9ea454d8e30a2ae8b90b6083a69455b2 1c522b3ecdb16b62e8031344f0b43a6e from 1.8080 to 1.8010 cycles 46 877 690 to 46 714 665
MultiSource/Applications/ALAC/encode/alacconvert-encode 9ea454d8e30a2ae8b90b6083a69455b2 1c522b3ecdb16b62e8031344f0b43a6e from 3.3780 to 3.3810 cycles 87 608 023 to 87 683 603
MultiSource/Applications/ClamAV/clamscan 1e4ff4bbe17058784446880fb1cbdbc3 6317ae08724b50b102cec3820aec1504 from 12.5110 to 12.5020 cycles 324 175 926 to 323 787 667
MultiSource/Applications/JM/ldecod/ldecod eb714519f2cbc75429558b56fc2526ba ff629bdfd346f7d43f156ed36b51c23e from 5.3510 to 5.3560 cycles 138 764 852 to 138 902 747
MultiSource/Applications/JM/lencod/lencod 271b2504c6fed9507b50267549d4913c 7df1e610fee456e9b6f21203b64bc1f6 from 0.0040 to 0.0040 cycles 10 687 719 213 to 11 102 500 468
MultiSource/Applications/sqlite3/sqlite3 01e54145dfcbb5fcaf59a7b633cd1eb1 8c89b8ab52017c5f7be8fd5a74424d2e from 0.0020 to 0.0020 cycles 6 083 405 311 to 6 086 694 618
MultiSource/Benchmarks/7zip/7zip-benchmark c22a0b720a5d3ae828b967865ed7a5c8 b9be55c3b6675c269fb6a38f7e1d984d from 0.0080 to 0.0080 cycles 22 662 222 432 to 22 768 754 448
MultiSource/Benchmarks/Bullet/bullet 725b4974cd846b7b902e8f0fe602935a 2c46c0aded63f473610748bfec1ef5cd from 0.0030 to 0.0030 cycles 9 137 473 246 to 9 075 514 873
MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR 3c995d9e1ed43a9b9343f0bf6cd5d006 3cc8fb2fd96528b9b5fffccb74bdb10e from 0.0010 to 0.0010 cycles 3 395 491 185 to 3 398 163 861
MultiSource/Benchmarks/MallocBench/gs/gs f8bc7dc74490b2b7da6dc4db4219bdc8 d838ca23ddf4b83cebc775d336bd77df from 3.0370 to 2.9790 cycles 78 655 771 to 77 251 300
MultiSource/Benchmarks/MiBench/consumer-jpeg/consumer-jpeg 97aaf98dbf91379e22c223c997d65b18 ec84ca9a186c2343d393c46b834c1df9 from 0.4150 to 0.4160 cycles 10 764 699 to 10 782 001
MultiSource/Benchmarks/MiBench/consumer-typeset/consumer-typeset b686e79fa11d74e19c1e25fc8b1b657d bff9b7302b95e2404e0a4d03776cc004 from 10.7630 to 10.7590 cycles 278 964 258 to 278 907 150
MultiSource/Benchmarks/MiBench/telecomm-gsm/telecomm-gsm 5b5eb8b3110e7e44faba9a39886f7eb9 3576c209ee3a3d7b5e61078ffe8bff8a from 7.5110 to 7.5190 cycles 194 825 194 to 195 024 904
MultiSource/Benchmarks/PAQ8p/paq8p a05ee284cc2aafc353525716ead34a46 271e3657c4254b44130ce1084ea3c31b from 0.0200 to 0.0210 cycles 54 418 140 186 to 54 629 177 762
MultiSource/Benchmarks/Prolangs-C/agrep/agrep 499c7b076644e0f05f509b8c9284ebbb cd4e690f68fb6f1de9180c712e6a1fcf from 0.3180 to 0.3160 cycles 8 222 873 to 8 192 390
MultiSource/Benchmarks/Prolangs-C/assembler/assembler 91ad4757272564f23c7ae5ccc82cf2cb e6ad7cda9bb5205bb3c16fb0c37eff9b from 0.1050 to 0.1050 cycles 2 727 287 to 2 709 236
MultiSource/Benchmarks/mediabench/g721/g721encode/encode 561896ca63930f7d23008d877670acb3 cd5779008318f623938dfa79c927b9d9 from 4.0160 to 3.9910 cycles 104 151 746 to 103 508 623
MultiSource/Benchmarks/mediabench/gsm/toast/toast 3882c68e752c5e4cd5d2ed73124603cb 3bb4cb4aa6ea339948820afbdd63ee66 from 1.0230 to 1.0220 cycles 26 530 366 to 26 480 722
MultiSource/Benchmarks/nbench/nbench 123c00a5e35f543e947426940c3b0f4b 12332b6a280ca58ea23af17223380627 from 0.0010 to 0.0010 cycles 3 427 928 721 to 3 483 881 476
MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4 c1cfe1bbcaa5a44502a2ce1dc8ad1990 a606ae7ed6888288475e7b14b506f698 from 15.2060 to 15.2050 cycles 394 425 741 to 394 390 513
SingleSource/Benchmarks/SmallPT/smallpt a55897d3f1b34a808cf8579ebb82856d 57116ccef43a438580537031893e70cf from 0.0060 to 0.0060 cycles 15 898 258 261 to 15 791 900 523
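
To make the rows above easier to compare, here is a minimal Python sketch that parses one line in this format and computes the relative change in cycles. It is not part of LNT or the test suite; the names `parse_row` and `cycle_delta_percent` are hypothetical, and the regular expression assumes the exact layout used in this comment (two 32-character hex hashes, "from X to Y", then "cycles A to B" with spaces as thousands separators).

```python
import re

# Assumed row layout, taken from the listing above:
#   <test> <hash before> <hash after> from <rt before> to <rt after> cycles <before> to <after>
ROW = re.compile(
    r"^(?P<name>\S+)\s+"
    r"(?P<hash_before>[0-9a-f]{32})\s+(?P<hash_after>[0-9a-f]{32})\s+"
    r"from\s+(?P<rt_before>[\d.]+)\s+to\s+(?P<rt_after>[\d.]+)\s+"
    r"cycles\s+(?P<cyc_before>[\d ]+?)\s+to\s+(?P<cyc_after>[\d ]+)$"
)


def parse_row(line):
    """Parse one result line into a dict (hypothetical helper)."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError("line does not match the expected layout: %r" % line)

    def to_int(text):
        # Cycle counts use spaces as thousands separators, e.g. "78 655 771".
        return int(text.replace(" ", ""))

    return {
        "name": m.group("name"),
        "binary_changed": m.group("hash_before") != m.group("hash_after"),
        "runtime_before": float(m.group("rt_before")),
        "runtime_after": float(m.group("rt_after")),
        "cycles_before": to_int(m.group("cyc_before")),
        "cycles_after": to_int(m.group("cyc_after")),
    }


def cycle_delta_percent(row):
    """Relative change in cycles, in percent; negative means fewer cycles after."""
    before = row["cycles_before"]
    return 100.0 * (row["cycles_after"] - before) / before


if __name__ == "__main__":
    sample = (
        "MultiSource/Benchmarks/MallocBench/gs/gs "
        "f8bc7dc74490b2b7da6dc4db4219bdc8 d838ca23ddf4b83cebc775d336bd77df "
        "from 3.0370 to 2.9790 cycles 78 655 771 to 77 251 300"
    )
    row = parse_row(sample)
    print("%s: %+.2f%% cycles" % (row["name"], cycle_delta_percent(row)))
```

Comparing cycle counts rather than the run-time column is a deliberate choice in this sketch, since several rows report the same rounded run time before and after while the cycle counts still differ.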


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127604/new/

https://reviews.llvm.org/D127604
