[PATCH] D42006: AArch64: Omit callframe setup/destroy when not necessary

Matthias Braun via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 30 16:19:24 PST 2018


So I took 453.povray (SPEC2006) and I cannot reproduce the performance regression on our devices. If there is any change, then it is below the measurement noise; that particular benchmark seems to be noisy in general for us. I tried both the ref and train input sets.

I used the following flags to hopefully get close to what you measured:
    -O3 -Xclang,-target-feature -Xclang,+use-postra-scheduler -fno-math-errno -ffp-contract=fast -fomit-frame-pointer      [1]

Also comparing assembly of the hottest functions (from train dataset) showed nothing that would explain a performance swing:

pov::All_CSG_Intersect_Intersections(pov::Object_Struct*, pov::Ray_Struct*, pov::istack_struct*)
    - some small changes in register numbering, same instructions
pov::All_Plane_Intersections(pov::Object_Struct*, pov::Ray_Struct*, pov::istack_struct*)
    - no changes
pov::All_Sphere_Intersections(pov::Object_Struct*, pov::Ray_Struct*, pov::istack_struct*)
    - 3 instructions scheduled differently, 1 stp moves closer to a memcpy
pov::Check_And_Enqueue(pov::Priority_Queue_Struct*, pov::BBox_Tree_Struct*, pov::Bounding_Box_Struct*, pov::Rayinfo_Struct*)
    - 1 lsl scheduled later
pov::Intersect_BBox_Tree(pov::BBox_Tree_Struct*, pov::Ray_Struct*, pov::istk_entry*, pov::Object_Struct**, bool)
    - ldr moved inside a sequence of ldrs.
pov::DNoise(double*, double*)
    - no changes
pov::All_Quadric_Intersections(pov::Object_Struct*, pov::Ray_Struct*, pov::istack_struct*)
    - no changes
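
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of how it could be scripted over the .s files left behind by -save-temps. It is not the procedure used above, just an illustration. The helper names (mnemonics_by_function, compare) are hypothetical, and it assumes that labels starting with "L" or ".L" are local basic-block labels and that comments start with ";" or "//". It only looks at mnemonics, so register renumbering is ignored on purpose.

```python
import re

# Matches "name:" at the start of a line, optionally followed by an instruction.
LABEL = re.compile(r"^([A-Za-z_.$][\w.$]*):\s*(.*)$")

def mnemonics_by_function(asm_text):
    """Map each function label to the ordered list of its instruction mnemonics."""
    functions, current = {}, None
    for raw in asm_text.splitlines():
        # Drop comments: ';' on Darwin, '//' on ELF targets (assumption).
        line = re.split(r";|//", raw, maxsplit=1)[0].strip()
        match = LABEL.match(line)
        if match:
            name, line = match.group(1), match.group(2).strip()
            # Assumption: 'L...' and '.L...' labels are local, not functions.
            if not name.startswith(("L", ".L")):
                current = name
                functions[current] = []
        # Skip blanks, assembler directives, and anything before the first function.
        if not line or line.startswith(".") or current is None:
            continue
        functions[current].append(line.split()[0])
    return functions

def compare(before_text, after_text):
    """Classify each function present in both assembly files."""
    before = mnemonics_by_function(before_text)
    after = mnemonics_by_function(after_text)
    report = {}
    for name in sorted(before.keys() & after.keys()):
        if before[name] == after[name]:
            report[name] = "same instruction sequence"
        elif sorted(before[name]) == sorted(after[name]):
            report[name] = "same instructions, different order"
        else:
            report[name] = "different instructions"
    return report
```

Calling compare() on the text of the old and new .s file for one translation unit gives a per-function verdict. Functions reported as "same instructions, different order" are the ones where only scheduling changed.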

Could the swings be explained by something in your environment? Could you dive in and see whether you can spot differences in the assembly?

- Matthias



[1] For reference:
clang++  -DNDEBUG  -save-temps=obj -save-stats=obj -Xclang,-target-feature -Xclang,+use-postra-scheduler -fno-math-errno -ffp-contract=fast -fomit-frame-pointer  -B /Applications/Xcode.app/Contents/Developer/Toolchains/iOS11.1.xctoolchain/usr/bin  -O3 -DNDEBUG -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS11.1.Internal.sdk   -w -Werror=date-time -DSPEC_CPU -DSPEC_CPU_MACOSX -DSPEC_CPU_LITTLEENDIAN -DSPEC_CPU_LP64 -Wno-implicit-function-declaration        (followed by -MD/-MT/-MF and filenames)

> On Jan 28, 2018, at 7:33 AM, Chad Rosier via Phabricator <reviews at reviews.llvm.org> wrote:
> 
> mcrosier added a comment.
> 
> Would it be possible to revert r322917 while we investigate the regressions?  We also identified a 3.61% regression in SPEC2006/bzip2, so here's the complete list of regressions we are currently seeing due to this change:
> 
> With -O3 -fno-math-errno -ffp-contract=fast -fomit-frame-pointer -mcpu=falkor:
> 
>  Spec2006/astar -3.25%
>  Spec2006/bzip2 -3.61%
>  Spec2006/povray -5.28%
>  Spec2017/povray -6.08%
> 
> With -O3 -flto -fuse-ld=gold -fno-math-errno -ffp-contract=fast -fwhole-program-vtables -fvisibility=hidden -fomit-frame-pointer -mcpu=falkor:
> 
>  Spec2006/astar -4.20%
>  Spec2006/h264ref -2.15%
> 
> 
> All tests were run on Falkor, but hopefully these issues can be reproduced on other targets.  Please let us know if you need any assistance reproducing, Matthias.
> 
> Chad
> 
> 
> Repository:
>  rL LLVM
> 
> https://reviews.llvm.org/D42006
> 
> 
> 