[PATCH] [AArch64] Enable partial unrolling and runtime unrolling for AArch64 target

Kevin Qin kevinqindev at gmail.com
Mon Sep 22 03:15:55 PDT 2014


Hi,

After many benchmarking experiments, I propose setting the loop buffer size for A57 to 20. This number is then used as the threshold that allows partial and runtime unrolling of small loops. The effect on SPEC2006 is:

Benchmark	Performance Improvement	Code Size Increment
400_perlbench	-1.20%	1.22%
401_bzip2	4.59%	0.00%
403_gcc	0.86%	0.36%
433_milc	-1.28%	2.71%
444_namd	-0.02%	1.32%
445_gobmk	0.38%	0.13%
447_dealII	0.29%	2.48%
450_soplex	-1.02%	3.57%
453_povray	2.80%	1.05%
456_hmmer	2.32%	3.59%
458_sjeng	0.04%	0.12%
462_libquantum	-0.11%	0.59%
464_h264ref	-0.03%	2.98%
470_lbm	0.02%	0.20%
471_omnetpp	2.67%	0.00%
473_astar	-1.38%	5.93%
482_sphinx3	-1.38%	3.98%
483_xalancbmk	-0.98%	0.71%
GEOMEAN	0.38%	1.70%
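
For anyone who wants to re-derive the summary row, here is a minimal sketch of the code size geomean. It assumes the GEOMEAN entry is the geometric mean of the per-benchmark ratios (1 + increment), which is my reading of the table rather than something stated in it; the function name geomean_change_pct is made up for this illustration.

```python
import math

# Code size increments (percent), copied from the SPEC2006 table above,
# in the same order as the benchmark rows.
code_size_increase_pct = [
    1.22, 0.00, 0.36, 2.71, 1.32, 0.13, 2.48, 3.57, 1.05,
    3.59, 0.12, 0.59, 2.98, 0.20, 0.00, 5.93, 3.98, 0.71,
]


def geomean_change_pct(changes_pct):
    """Geometric mean of the ratios (1 + change), returned as a percent change.

    Assumption: each percentage is a relative change against the baseline,
    so the per-benchmark ratio is 1 + change / 100.
    """
    log_sum = sum(math.log1p(c / 100.0) for c in changes_pct)
    return math.expm1(log_sum / len(changes_pct)) * 100.0


if __name__ == "__main__":
    print(f"code size geomean: {geomean_change_pct(code_size_increase_pct):.2f}%")
```

Under that assumption the result lands within rounding distance of the 1.70% reported above. The performance column is left out on purpose, since its geomean depends on whether the improvement is measured against the old or the new run time.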

I also ran experiments on SPEC2000, but there was no significant performance impact. The geomean code size increment on SPEC2000 is 1.76%, and the increase is below 5% on every benchmark except 179.art, where it is 14%. Fortunately, this is partly fixed by my loop prologue simplification patch, which brings the code bloat on 179.art down to 9.45%.

Overall, this patch brings about a 0.4% performance improvement on SPEC2006 with about a 1.8% code size increment, and the code bloat on every benchmark is under 10% once the separate loop prologue simplification patch is applied. So I think it's acceptable for all optimization levels except -Os.
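
To make the role of the threshold concrete, here is a simplified model of how a loop buffer size of 20 would gate partial unrolling. It is a sketch of the idea only, not the unroller's actual logic: the function name, the max_count cap, and the assumption that loop size and threshold are counted in the same units are all mine.

```python
def partial_unroll_count(loop_size, threshold=20, max_count=8):
    """Pick the largest unroll factor whose unrolled body fits in `threshold`.

    loop_size: size of the loop body, assumed to be in the same units as
               the threshold (hypothetical; the real cost model differs).
    threshold: the loop buffer size, 20 as proposed in this mail.
    max_count: hypothetical upper bound on the unroll factor.

    Returns 1 (no unrolling) when the body alone is too big to duplicate.
    """
    if loop_size <= 0:
        raise ValueError("loop_size must be positive")
    count = threshold // loop_size
    return max(1, min(max_count, count))


if __name__ == "__main__":
    for size in (2, 5, 12, 25):
        print(size, partial_unroll_count(size))
```

In this model only loops small enough to be duplicated at least twice within the threshold get unrolled, which is why the change is limited to small loops and the code size growth stays modest.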

Thanks,
Kevin

http://reviews.llvm.org/D5148

Files:
  lib/Target/AArch64/AArch64SchedA57.td
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D5148.13921.patch
Type: text/x-patch
Size: 618 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140922/20d32458/attachment.bin>

