[AArch64] Enable partial unrolling on cortex-a57 and 2 related improvement
kevinqindev at gmail.com
Tue Mar 3 19:03:04 PST 2015
2015-02-28 13:55 GMT+08:00 Kevin Qin <kevinqindev at gmail.com>:
> Previously, I made commit r219401, which tried to enable partial & runtime
> unrolling on cortex-a57, but I forgot to call the base TTI implementation in
> the target-specific hook, so those unrolling methods were never really
> enabled. Here are the patch to enable them and 2 related improvements:
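To make the described bug concrete, here is a minimal standalone sketch of an override that forgets to chain to its base implementation. It does not use LLVM's real classes: UnrollPrefs, BaseHook, BrokenTargetHook, FixedTargetHook and the value 150 are all hypothetical names and numbers chosen for illustration, and the assumption that the base hook is what turns the flags on is taken from the description above.

```cpp
#include <cassert>

// Hypothetical stand-in for the unrolling preferences a target can adjust.
struct UnrollPrefs {
  bool Partial = false;
  bool Runtime = false;
  unsigned PartialThreshold = 0;
};

// Hypothetical base hook: assumed to be the place where partial and
// runtime unrolling get switched on.
struct BaseHook {
  virtual ~BaseHook() = default;
  virtual void getUnrollPrefs(UnrollPrefs &UP) const {
    UP.Partial = true;
    UP.Runtime = true;
  }
};

// Override that only tweaks a threshold and never calls the base,
// so the flags stay at their defaults (off).
struct BrokenTargetHook : BaseHook {
  void getUnrollPrefs(UnrollPrefs &UP) const override {
    UP.PartialThreshold = 150; // illustrative value only
  }
};

// Same override, but chaining to the base first.
struct FixedTargetHook : BaseHook {
  void getUnrollPrefs(UnrollPrefs &UP) const override {
    BaseHook::getUnrollPrefs(UP);
    UP.PartialThreshold = 150; // illustrative value only
  }
};
```

With the broken hook the threshold is set but both flags remain false; with the fixed hook all three fields are populated.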
> 0001 - Run the LICM pass after the loop unrolling pass. Runtime unrolling
> introduces a runtime check in the loop prologue (you can treat it as a loop
> preheader). If the unrolled loop is an inner loop, the prologue ends up
> inside the outer loop. The LICM pass can then hoist the runtime check out of
> the outer loop if the checked value is loop invariant.
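As a rough source-level picture of what 0001 is after, the sketch below hand-writes a runtime-unrolled inner loop (factor 4) with its remainder check already moved outside the outer loop. This only models the shape of the transformed code; it is not compiler output. The function names and the unroll factor are assumptions, and every row is assumed to hold at least n elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference version: plain nested loop over the first n elements of each row.
long sumRows(const std::vector<std::vector<int>> &rows, std::size_t n) {
  long total = 0;
  for (const auto &row : rows)
    for (std::size_t i = 0; i < n; ++i)
      total += row[i];
  return total;
}

// Hand-written model of the inner loop after runtime unrolling by 4.
long sumRowsUnrolled(const std::vector<std::vector<int>> &rows, std::size_t n) {
  long total = 0;
  // The runtime check: how many iterations the prologue must peel off.
  // It depends only on n, which is invariant in the outer loop, so it can
  // be computed once here instead of once per outer iteration.
  const std::size_t extra = n % 4;
  for (const auto &row : rows) {
    std::size_t i = 0;
    // Prologue: run the leftover iterations first.
    for (; i < extra; ++i)
      total += row[i];
    // Unrolled body: the remaining trip count is a multiple of 4.
    for (; i < n; i += 4) {
      total += row[i];
      total += row[i + 1];
      total += row[i + 2];
      total += row[i + 3];
    }
  }
  return total;
}
```

If the computation of extra sat inside the outer loop, it would be repeated for every row even though its result never changes, which is the overhead the extra LICM run is meant to remove.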
> 0002 - Introduce runtime unrolling disable metadata and use it to mark the
> scalar loop from vectorization. Runtime unrolling is an expensive
> optimization that pays off only if the loop is hot and its iteration
> count is relatively large. For some loops, we know that runtime unrolling
> is not worthwhile. The scalar loop from vectorization is one of those
> cases.
> 0003 - Enable partial & runtime unrolling on cortex-a57, and double the
> unrolling threshold if the loop depth > 1. The inner loop of a loop nest
> is more likely to be hot, and patch 0001 lets the runtime check be hoisted
> out of the outer loop, so the overhead is lower and we can use a larger
> threshold to unroll more loops.
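The depth-based rule in 0003 boils down to a one-line decision. The helper below is a sketch of that rule only; effectiveUnrollThreshold and its parameters are hypothetical names, not the code in the patch.

```cpp
#include <cassert>

// Hypothetical helper: pick the unrolling threshold for a loop.
// Loops nested inside another loop (depth > 1) get twice the base
// threshold; top-level loops keep the base value.
unsigned effectiveUnrollThreshold(unsigned baseThreshold, unsigned loopDepth) {
  return loopDepth > 1 ? baseThreshold * 2 : baseThreshold;
}
```

Note that the doubling is applied once, so a loop at depth 3 gets the same threshold as a loop at depth 2.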
> Combining the above changes, we get the following performance and code
> size changes.
> Benchmark                   Execution time  Code bloat
> spec.cpu2000.179_art              -16.567%      8.805%
> spec.cpu2000.177_mesa              -2.771%      1.912%
> spec.cpu2006.483_xalancbmk         -2.555%      0.076%
> spec.cpu2000.256_bzip2             -1.648%      2.414%
> spec.cpu2006.433_milc              -1.228%      1.353%
> spec.cpu2006.456_hmmer             -1.079%      2.413%
> spec.cpu2006.462_libquantum         2.492%      1.482%
> spec.cpu2000.253_perlbmk            1.563%      0.464%
> spec.cpu2006.450_soplex             1.379%      1.925%
> spec.cpu2000.186_crafty             1.242%      0.005%
> spec.geomean                       -0.546%      0.952%
> Please review. Thanks.
> Best Regards,
> Kevin Qin