[PATCH] [AArch64] Enable partial unrolling and runtime unrolling for AArch64 target

Kevin Qin kevinqindev at gmail.com
Tue Sep 9 03:19:24 PDT 2014


I can give more details on the performance regressions here. Overall, partial unrolling yields small performance improvements, some regressions, and code size changes. Most of the regressions are caused by the runtime unrolling prologue. This prologue does extra work to check the loop trip count and to execute the remainder iterations of the loop body. If the runtime-unrolled loop is nested inside another loop, and the inner loop's trip count is quite small on each run, then the prologue overhead is paid on every entry to the inner loop, and that causes the regression.
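To illustrate where the overhead comes from, here is a hand-written C sketch of roughly what a runtime-unrolled loop looks like. It is only an approximation: the function names, the summation body, and the unroll factor of 4 are assumptions chosen for illustration, not actual compiler output.

```c
#include <stddef.h>

/* Original loop: sums n elements. */
long sum_simple(const int *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Hand-written approximation of the same loop runtime-unrolled by 4
 * (the factor is an assumption for this sketch). */
long sum_unrolled4(const int *a, size_t n) {
    long s = 0;
    size_t i = 0;

    /* Extra work that the original loop did not have:
     * compute how many iterations do not fit the unroll factor. */
    size_t rem = n % 4;

    /* Prologue: execute the remainder iterations one at a time. */
    for (; i < rem; ++i)
        s += a[i];

    /* Unrolled body: the remaining iteration count is a multiple of 4. */
    for (; i < n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    return s;
}
```

If such a function is called from an outer loop with n smaller than 4, the unrolled body never runs: every call pays for the remainder computation and the extra branches, and all the real work is done in the prologue.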

Maybe we could add a heuristic to avoid unrolling such loops, but it's hard to decide what kind of heuristic to use, because most of the information that would help make this decision is only available at run time and is difficult to estimate at compile time. So it may take a lot of time to tune, and the benefit is unknown.

On the other hand, the runtime unrolling algorithm is already there and has been shown to give performance improvements on SPEC2000 and SPEC2006. So I suggest we enable it first, and then try to make it even better in the future.

Regards,
Kevin

http://reviews.llvm.org/D5148





