[AArch64] Enable partial unrolling on cortex-a57 and 2 related improvement

Kevin Qin kevinqindev at gmail.com
Wed Mar 4 01:09:51 PST 2015


Hi Hal,

Thanks a lot for your review. I totally agree with your review comments,
and here are the updated patches.

Regards,
Kevin

2015-03-04 11:29 GMT+08:00 Hal Finkel <hfinkel at anl.gov>:

> ----- Original Message -----
> > From: "Kevin Qin" <kevinqindev at gmail.com>
> > To: "llvm-commits" <llvm-commits at cs.uiuc.edu>
> > Sent: Friday, February 27, 2015 11:55:05 PM
> > Subject: [AArch64] Enable partial unrolling on cortex-a57 and 2 related
>      improvement
> >
> >
> >
> > Hi,
> >
> >
> > Previously, I made commit r219401, which tried to enable partial &
> > runtime unrolling on cortex-a57, but I forgot to call the base TTI
> > implementation in the target-specific hook, so those unrolling methods
> > were not really enabled.
> >
> >
> > Here is the patch to get them enabled, plus 2 related patches to
> > improve it.
> >
> >
> > 0001 - Run the LICM pass after the loop unrolling pass. Runtime unrolling
> > will introduce a runtime check in the loop prologue (you can treat it as
> > a loop preheader). If the unrolled loop is an inner loop, then the
> > prologue will be inside the outer loop. The LICM pass can help to
> > hoist the runtime check out of the outer loop if the checked value is
> > loop invariant.
>
> I think this makes sense, at least for LICM, and is consistent with what James
> observed from the early run of the unroller. Please add a comment
> explaining why those passes are there. This file does not have many
> 'rationale' comments, and this is not a good thing. Why are you adding CVP?
> Can you please add some test cases (we normally don't add tests that run
> the full pipeline, but for testing the pipeline, it is a good idea)?
>
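
To make the effect described for 0001 concrete, here is a source-level
sketch (not the IR the unroller actually emits). It assumes an unroll
factor of 4 and a trip count n that is invariant in the outer loop; the
function names are hypothetical. The first function recomputes the
remainder for the prologue on every outer iteration, the second shows
the shape after the invariant computation has been hoisted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inner loop runtime-unrolled by 4. The remainder that decides how many
// prologue iterations run is recomputed on every outer iteration.
long sumRowsUnrolled(const std::vector<std::vector<int>> &rows, std::size_t n) {
  long total = 0;
  for (const auto &row : rows) {
    assert(row.size() >= n && "each row must hold at least n elements");
    std::size_t rem = n % 4;            // invariant in the outer loop
    std::size_t i = 0;
    for (; i < rem; ++i)                // prologue: leftover iterations
      total += row[i];
    for (; i < n; i += 4)               // unrolled body, 4 elements per trip
      total += row[i] + row[i + 1] + row[i + 2] + row[i + 3];
  }
  return total;
}

// Same computation after hoisting: the remainder is computed once,
// outside the outer loop, which is the kind of motion LICM performs.
long sumRowsHoisted(const std::vector<std::vector<int>> &rows, std::size_t n) {
  long total = 0;
  const std::size_t rem = n % 4;        // hoisted out of the outer loop
  for (const auto &row : rows) {
    assert(row.size() >= n && "each row must hold at least n elements");
    std::size_t i = 0;
    for (; i < rem; ++i)
      total += row[i];
    for (; i < n; i += 4)
      total += row[i] + row[i + 1] + row[i + 2] + row[i + 3];
  }
  return total;
}
```

Both functions return the same result; only the placement of the
remainder computation differs.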
> >
> >
> > 0002 - Introduce runtime unrolling disable metadata and use it to
> > mark the scalar loop from vectorization. Runtime unrolling is an
> > expensive optimization which brings a benefit only if the loop is
> > hot and its iteration count is relatively large. For some loops,
> > we know they are not worth runtime unrolling. The scalar loop
> > from vectorization is one of these cases.
>
> I think this is a good idea, but we might be overlooking
> something. Usually the purpose of the scalar loop is only to handle the 'left
> over' part of the iteration space that is not divisible by the vector
> length. However, if there are runtime safety checks, and those checks
> generally fail, then the loop could be hot. Can we exclude the case where
> we've emitted safety checks?
>
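
The rule suggested in the reply to 0002 can be written down as a small
predicate. This is only a sketch of the decision, not the code in the
patch: the struct, its fields, and the function name are hypothetical
and stand in for whatever the vectorizer actually tracks.

```cpp
// Hypothetical summary of what is known about a scalar loop left behind
// by the vectorizer. Neither field name comes from the patch.
struct ScalarLoopInfo {
  bool IsVectorizerRemainder;   // loop is the scalar version of a vectorized loop
  bool HasRuntimeSafetyChecks;  // vector loop is guarded by runtime safety checks
};

// Decide whether to mark the loop as "do not runtime-unroll".
bool shouldDisableRuntimeUnroll(const ScalarLoopInfo &Info) {
  // Loops that did not come from the vectorizer are left alone.
  if (!Info.IsVectorizerRemainder)
    return false;
  // With safety checks, the scalar loop also runs the whole iteration
  // space whenever a check fails, so it may be hot: keep unrolling allowed.
  return !Info.HasRuntimeSafetyChecks;
}
```

Under this rule only a pure remainder loop, which runs fewer iterations
than the vector length, gets the disable marker.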
> >
> >
> > 0003 - Enable partial & runtime unrolling on cortex-a57, and double
> > the unrolling threshold if the loop depth > 1. The inner loop of a
> > loop nest is more likely to be hot, and patch 0001 lets the runtime
> > check be hoisted out of the outer loop, so the overhead is lower and
> > we can try a larger threshold to unroll more loops.
> >
>
>
> +  if (L->getLoopDepth() > 1)
> +    UP.PartialThreshold *= 2;
>
> Please add a comment here.
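
One possible shape for the requested comment, shown on a self-contained
stand-in rather than the real hook. UnrollPrefs and adjustForLoopDepth
are hypothetical names; only PartialThreshold and the depth test come
from the quoted diff, and the rationale in the comment is taken from
the description of 0003.

```cpp
#include <cassert>

// Hypothetical stand-in for the unrolling preferences a target hook
// fills in; only the field named in the quoted diff is modelled.
struct UnrollPrefs {
  unsigned PartialThreshold;
};

// Adjust the partial-unrolling threshold for a loop at the given depth.
// Depth is 1-based: an outermost loop has depth 1.
void adjustForLoopDepth(UnrollPrefs &UP, unsigned LoopDepth) {
  assert(LoopDepth >= 1 && "loop depth is 1-based");
  // Inner loops of a nest are more likely to be hot, and the runtime
  // unrolling check can be hoisted out of the enclosing loop by the LICM
  // run that follows unrolling, so allow them twice the usual budget.
  if (LoopDepth > 1)
    UP.PartialThreshold *= 2;
}
```

For example, a threshold of 150 stays at 150 for an outermost loop and
becomes 300 for any nested loop.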
>
>  -Hal
>
> >
> >
> >
> > Combining the above changes, we get the following performance and
> > code size changes.
> >
> >
> > Benchmark                    Execution time   Code bloat
> >
> > spec.cpu2000.179_art               -16.567%       8.805%
> > spec.cpu2000.177_mesa               -2.771%       1.912%
> > spec.cpu2006.483_xalancbmk          -2.555%       0.076%
> > spec.cpu2000.256_bzip2              -1.648%       2.414%
> > spec.cpu2006.433_milc               -1.228%       1.353%
> > spec.cpu2006.456_hmmer              -1.079%       2.413%
> >
> > spec.cpu2006.462_libquantum          2.492%       1.482%
> > spec.cpu2000.253_perlbmk             1.563%       0.464%
> > spec.cpu2006.450_soplex              1.379%       1.925%
> > spec.cpu2000.186_crafty              1.242%       0.005%
> >
> > spec.geomean                        -0.546%       0.952%
> >
> >
> > Please review. Thanks.
> >
> >
> > --
> >
> >
> > Best Regards,
> >
> >
> > Kevin Qin
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>



-- 
Best Regards,

Kevin Qin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Run-LICM-pass-after-loop-unrolling-pass.patch
Type: text/x-patch
Size: 2722 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150304/12f3070b/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Introduce-runtime-unrolling-disable-matadata-and-use.patch
Type: text/x-patch
Size: 62870 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150304/12f3070b/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Enable-partial-runtime-unrolling-on-cortex-a57-and-d.patch
Type: text/x-patch
Size: 5531 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150304/12f3070b/attachment-0002.bin>

