[PATCH] [AArch64] Enable partial unrolling and runtime unrolling for AArch64 target
Kevin Qin
kevinqindev at gmail.com
Mon Sep 22 03:15:55 PDT 2014
Hi,
I propose setting the loop buffer size for A57 to 20, a value chosen after extensive benchmarking. This number is then used as the threshold that allows partial and runtime unrolling of small loops. The effect on SPEC2006 is:
Benchmark      | Performance Improvement | Code Size Increment
400_perlbench  | -1.20%                  | 1.22%
401_bzip2      |  4.59%                  | 0.00%
403_gcc        |  0.86%                  | 0.36%
433_milc       | -1.28%                  | 2.71%
444_namd       | -0.02%                  | 1.32%
445_gobmk      |  0.38%                  | 0.13%
447_dealII     |  0.29%                  | 2.48%
450_soplex     | -1.02%                  | 3.57%
453_povray     |  2.80%                  | 1.05%
456_hmmer      |  2.32%                  | 3.59%
458_sjeng      |  0.04%                  | 0.12%
462_libquantum | -0.11%                  | 0.59%
464_h264ref    | -0.03%                  | 2.98%
470_lbm        |  0.02%                  | 0.20%
471_omnetpp    |  2.67%                  | 0.00%
473_astar      | -1.38%                  | 5.93%
482_sphinx3    | -1.38%                  | 3.98%
483_xalancbmk  | -0.98%                  | 0.71%
GEOMEAN        |  0.38%                  | 1.70%
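
For readers who want to re-derive the GEOMEAN row, here is a minimal Python sketch. It assumes each percentage p stands for a ratio of 1 + p/100 and that the geomean is taken over those ratios; the post does not say which convention was used, so the result may differ from the reported figures in the last digit. The names RESULTS and geomean_percent are illustrative only.

```python
import math

# (performance improvement %, code size increment %) copied from the table above
RESULTS = {
    "400_perlbench": (-1.20, 1.22), "401_bzip2": (4.59, 0.00),
    "403_gcc": (0.86, 0.36), "433_milc": (-1.28, 2.71),
    "444_namd": (-0.02, 1.32), "445_gobmk": (0.38, 0.13),
    "447_dealII": (0.29, 2.48), "450_soplex": (-1.02, 3.57),
    "453_povray": (2.80, 1.05), "456_hmmer": (2.32, 3.59),
    "458_sjeng": (0.04, 0.12), "462_libquantum": (-0.11, 0.59),
    "464_h264ref": (-0.03, 2.98), "470_lbm": (0.02, 0.20),
    "471_omnetpp": (2.67, 0.00), "473_astar": (-1.38, 5.93),
    "482_sphinx3": (-1.38, 3.98), "483_xalancbmk": (-0.98, 0.71),
}


def geomean_percent(percentages):
    """Geometric mean of percentage changes, returned as a percentage.

    Assumption: a change of p% corresponds to the ratio 1 + p/100.
    """
    ratios = [1.0 + p / 100.0 for p in percentages]
    # Average in log space, then convert back to a ratio.
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return (math.exp(log_mean) - 1.0) * 100.0


perf_geomean = geomean_percent(perf for perf, _ in RESULTS.values())
size_geomean = geomean_percent(size for _, size in RESULTS.values())

if __name__ == "__main__":
    print(f"performance geomean: {perf_geomean:.2f}%")
    print(f"code size geomean:   {size_geomean:.2f}%")
```

Averaging in log space keeps one large outlier (such as 473_astar's code size) from dominating the summary the way it would in an arithmetic mean.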
I also ran experiments on SPEC2000, but saw no significant performance impact. The geomean code size increase on SPEC2000 is 1.76%, and the increase on every benchmark is below 5% except 179.art, where it is 14%. Fortunately, my loop prologue simplification patch partly fixes this, reducing the code bloat on 179.art to 9.45%.
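
To make the terms concrete, here is a small Python model of what a size threshold and runtime unrolling do to a loop. It is only a sketch under stated assumptions, not LLVM's actual heuristic: pick_unroll_count, its power-of-two restriction, and the max_count cap are hypothetical, and body_size is an abstract instruction count. The prologue in sum_runtime_unrolled is the extra code that handles leftover iterations, which is where the additional code size comes from.

```python
def pick_unroll_count(body_size, threshold=20, max_count=8):
    """Pick a power-of-two unroll count whose unrolled body fits the threshold.

    Hypothetical heuristic for illustration; `threshold` plays the role of
    the loop buffer size proposed in this post.
    """
    count = 1
    # Keep doubling while the doubled body still fits under the threshold.
    while count * 2 <= max_count and body_size * count * 2 <= threshold:
        count *= 2
    return count


def sum_runtime_unrolled(values, count):
    """Sum `values` the way a runtime-unrolled loop would execute.

    The trip count is only known at run time, so a prologue first runs the
    len(values) % count leftover iterations one at a time; the main loop
    then always runs a whole number of unrolled trips.
    """
    n = len(values)
    total = 0
    extra = n % count
    # Prologue: leftover iterations, executed with the original loop body.
    for i in range(extra):
        total += values[i]
    # Main loop: each trip stands for `count` straight-line copies of the body.
    for i in range(extra, n, count):
        for j in range(count):
            total += values[i + j]
    return total


if __name__ == "__main__":
    count = pick_unroll_count(body_size=5)
    print(count, sum_runtime_unrolled(list(range(10)), count))
```

In this model a loop whose body is larger than the threshold gets a count of 1 and is left alone, so only small loops pay the prologue cost.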
Overall, this patch brings about a 0.4% performance improvement on SPEC2006 with about a 1.8% code size increase, and all code bloat is under 10% once the separate loop prologue simplification patch is also applied. So I think it is acceptable for all optimization levels except -Os.
Thanks,
Kevin
http://reviews.llvm.org/D5148
Files:
lib/Target/AArch64/AArch64SchedA57.td
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D5148.13921.patch
Type: text/x-patch
Size: 618 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140922/20d32458/attachment.bin>