[PATCH] D122148: [SLP] Peek into loads when hitting the RecursionMaxDepth
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 4 07:44:23 PDT 2022
dmgreen added a comment.
I've tried SPEC on AArch64, which was one of the places that showed the need for something like this. On AArch64, together with D122145 <https://reviews.llvm.org/D122145>, this helps one of the routines in x264 enough to speed up the whole benchmark. The exact speedup is quite dependent on the ordering of shuffles chosen, and may need some more work to come out as good as it can. The introduction of select shuffles wasn't great for targets without them. We are talking about something like a 5% improvement.
I've been trying to run X86 too, but I don't have a great setup for it. On what I think is some sort of "Xeon 6148" thing, these were the performance scores I saw from running SPEC 2017 with -march=native (again with this and D122145 <https://reviews.llvm.org/D122145>):
500.perlbench_r 0.829412281
502.gcc_r 0.398122431
505.mcf_r -0.352758469
520.omnetpp_r 0.256906516
523.xalancbmk_r -0.789195872
525.x264_r 0.198126785
531.deepsjeng_r -0.024245574
541.leela_r 0.003397322
557.xz_r -0.722254612
They can be quite noisy though, I'm afraid, even with those results averaged over three runs. I wouldn't be surprised if all those changes were down to machine noise or knock-on alignment changes. We have a better setup for AArch64 than we do for X86, but there is always some noise.
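As a rough illustration of the kind of check implied here, the sketch below averages a few run times per configuration and asks whether the difference in means is larger than the run-to-run spread. The function names, the sign convention (positive means faster) and the noise criterion are assumptions made for the example, not how the scores above were produced.

```python
from statistics import mean, stdev

def percent_change(baseline, patched):
    """Percent improvement of patched over baseline run times.

    Positive means the patched build was faster (an assumed convention).
    """
    base = mean(baseline)
    return (base - mean(patched)) / base * 100.0

def within_noise(baseline, patched):
    """True if the difference in means is no larger than the bigger of the
    two sample standard deviations. This is a crude heuristic, not a
    proper statistical test."""
    spread = max(stdev(baseline), stdev(patched))
    return abs(mean(baseline) - mean(patched)) <= spread
```

With only three runs per configuration the standard deviation is itself a poor estimate, so a check like this can only flag results that are clearly inside or clearly outside the noise.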
I tried SPEC 2006 on AArch64 too. Only the perlbench binary changed there, according to the file hashes, and the performance was the same.
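For reference, a comparison by file hash along these lines can be sketched as follows. This is a minimal example, assuming the baseline and patched binaries sit in two flat directories under the same file names; the helper names are hypothetical and this is not the script actually used.

```python
import hashlib
from pathlib import Path

def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def changed_binaries(base_dir, new_dir):
    """List names of files present in both directories whose contents differ."""
    base_dir, new_dir = Path(base_dir), Path(new_dir)
    changed = []
    for base_file in sorted(base_dir.iterdir()):
        new_file = new_dir / base_file.name
        if base_file.is_file() and new_file.is_file():
            if file_sha256(base_file) != file_sha256(new_file):
                changed.append(base_file.name)
    return changed
```

Binaries whose hashes match are byte-identical, so only the ones reported as changed need to be re-run.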
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D122148/new/
https://reviews.llvm.org/D122148