[llvm] 8e3f59b - [AArch64] Add option to enable/disable load-store renaming.

Maxim Kuvyrkov via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 17 08:39:44 PST 2020


Hi Florian,

This change regresses performance of 464.h264ref (from SPEC CPU2006) by 6% on aarch64-linux-gnu.  This happens due to 40% regression in FastFullPelBlockMotionSearch().

This patch looks like a no-op, but, apparently, it's not.  Is it expected that it's not a no-op?

I see same regression on the release branch after backport of this patch as well.

Would you please investigate?

Let me know if you need help reproducing this.  I have pre-processed source, generated assembly and perf profiles handy.

Thanks!

--
Maxim Kuvyrkov
https://www.linaro.org

> On Jan 28, 2020, at 2:17 AM, Florian Hahn via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> 
> 
> Author: Florian Hahn
> Date: 2020-01-27T15:15:50-08:00
> New Revision: 8e3f59b45ae185cc9b4e3a817d7ac958f1d55976
> 
> URL: https://github.com/llvm/llvm-project/commit/8e3f59b45ae185cc9b4e3a817d7ac958f1d55976
> DIFF: https://github.com/llvm/llvm-project/commit/8e3f59b45ae185cc9b4e3a817d7ac958f1d55976.diff
> 
> LOG: [AArch64] Add option to enable/disable load-store renaming.
> 
> This patch adds a new option to enable/disable register renaming in the
> load-store optimizer. Defaults to disabled, as there is a potential
> mis-compile caused by this.
> 
> Added: 
> 
> 
> Modified: 
>    llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
>    llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll
>    llvm/test/CodeGen/AArch64/arm64-abi_align.ll
>    llvm/test/CodeGen/AArch64/arm64-variadic-aapcs.ll
>    llvm/test/CodeGen/AArch64/machine-outliner-remarks.ll
>    llvm/test/CodeGen/AArch64/machine-outliner.ll
>    llvm/test/CodeGen/AArch64/stp-opt-with-renaming-debug.mir
>    llvm/test/CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir
>    llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir
> 
> Removed: 
> 
> 
> 
> ################################################################################
> diff  --git a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
> index bc91d628f0b4..cbca29b63b70 100644
> --- a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
> +++ b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
> @@ -66,6 +66,10 @@ static cl::opt<unsigned> LdStLimit("aarch64-load-store-scan-limit",
> static cl::opt<unsigned> UpdateLimit("aarch64-update-scan-limit", cl::init(100),
>                                      cl::Hidden);
> 
> +// Enable register renaming to find additional store pairing opportunities.
> +static cl::opt<bool> EnableRenaming("aarch64-load-store-renaming",
> +                                    cl::init(false), cl::Hidden);
> +
> #define AARCH64_LOAD_STORE_OPT_NAME "AArch64 load / store optimization pass"
> 
> namespace {
> @@ -1446,6 +1450,9 @@ AArch64LoadStoreOpt::findMatchingInsn(MachineBasicBlock::iterator I,
>   bool IsPromotableZeroStore = isPromotableZeroStoreInst(FirstMI);
> 
>   Optional<bool> MaybeCanRename = None;
> +  if (!EnableRenaming)
> +    MaybeCanRename = {false};
> +
>   SmallPtrSet<const TargetRegisterClass *, 5> RequiredClasses;
>   LiveRegUnits UsedInBetween;
>   UsedInBetween.init(*TRI);
> 
> diff  --git a/llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll b/llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll
> index 7caa4c06d698..e904e86d8e6f 100644
> --- a/llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll
> +++ b/llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll
> @@ -1,5 +1,5 @@
> ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
> -; RUN: llc < %s -mtriple=arm64-apple-ios7.0.0 -mcpu=cyclone -enable-misched=false | FileCheck %s
> +; RUN: llc -aarch64-load-store-renaming=true < %s -mtriple=arm64-apple-ios7.0.0 -mcpu=cyclone -enable-misched=false | FileCheck %s
> 
> ; rdar://13625505
> ; Here we have 9 fixed integer arguments the 9th argument in on stack, the
> 
> diff  --git a/llvm/test/CodeGen/AArch64/arm64-abi_align.ll b/llvm/test/CodeGen/AArch64/arm64-abi_align.ll
> index 189546a4553b..b76d453c630a 100644
> --- a/llvm/test/CodeGen/AArch64/arm64-abi_align.ll
> +++ b/llvm/test/CodeGen/AArch64/arm64-abi_align.ll
> @@ -1,5 +1,5 @@
> -; RUN: llc -fast-isel-sink-local-values < %s -mtriple=arm64-apple-darwin -mcpu=cyclone -enable-misched=false -frame-pointer=all | FileCheck %s
> -; RUN: llc -fast-isel-sink-local-values < %s -mtriple=arm64-apple-darwin -O0 -frame-pointer=all -fast-isel | FileCheck -check-prefix=FAST %s
> +; RUN: llc -fast-isel-sink-local-values -aarch64-load-store-renaming=true < %s -mtriple=arm64-apple-darwin -mcpu=cyclone -enable-misched=false -frame-pointer=all | FileCheck %s
> +; RUN: llc -fast-isel-sink-local-values -aarch64-load-store-renaming=true  < %s -mtriple=arm64-apple-darwin -O0 -frame-pointer=all -fast-isel | FileCheck -check-prefix=FAST %s
> 
> ; rdar://12648441
> ; Generated from arm64-arguments.c with -O2.
> 
> diff  --git a/llvm/test/CodeGen/AArch64/arm64-variadic-aapcs.ll b/llvm/test/CodeGen/AArch64/arm64-variadic-aapcs.ll
> index 5f4c17d0cfba..94c6d69e329f 100644
> --- a/llvm/test/CodeGen/AArch64/arm64-variadic-aapcs.ll
> +++ b/llvm/test/CodeGen/AArch64/arm64-variadic-aapcs.ll
> @@ -1,4 +1,4 @@
> -; RUN: llc -verify-machineinstrs -mtriple=arm64-linux-gnu -pre-RA-sched=linearize -enable-misched=false -disable-post-ra < %s | FileCheck %s
> +; RUN: llc -aarch64-load-store-renaming=true -verify-machineinstrs -mtriple=arm64-linux-gnu -pre-RA-sched=linearize -enable-misched=false -disable-post-ra < %s | FileCheck %s
> 
> %va_list = type {i8*, i8*, i8*, i32, i32}
> 
> 
> diff  --git a/llvm/test/CodeGen/AArch64/machine-outliner-remarks.ll b/llvm/test/CodeGen/AArch64/machine-outliner-remarks.ll
> index ce768adbfeaf..50b4ed3af6f9 100644
> --- a/llvm/test/CodeGen/AArch64/machine-outliner-remarks.ll
> +++ b/llvm/test/CodeGen/AArch64/machine-outliner-remarks.ll
> @@ -4,7 +4,7 @@
> ; CHECK-SAME: Bytes from outlining all occurrences (16) >=
> ; CHECK-SAME: Unoutlined instruction bytes (16)
> ; CHECK-SAME: (Also found at: <UNKNOWN LOCATION>)
> -; CHECK: remark: <unknown>:0:0: Saved 36 bytes by outlining 11 instructions
> +; CHECK: remark: <unknown>:0:0: Saved 48 bytes by outlining 14 instructions
> ; CHECK-SAME: from 2 locations. (Found at: <UNKNOWN LOCATION>,
> ; CHECK-SAME: <UNKNOWN LOCATION>)
> ; RUN: llc %s -enable-machine-outliner -mtriple=aarch64-unknown-unknown -o /dev/null -pass-remarks-missed=machine-outliner -pass-remarks-output=%t.yaml
> @@ -38,10 +38,10 @@
> ; YAML-NEXT: Function:        OUTLINED_FUNCTION_0
> ; YAML-NEXT: Args:
> ; YAML-NEXT:   - String:          'Saved '
> -; YAML-NEXT:   - OutliningBenefit: '36'
> +; YAML-NEXT:   - OutliningBenefit: '48'
> ; YAML-NEXT:   - String:          ' bytes by '
> ; YAML-NEXT:   - String:          'outlining '
> -; YAML-NEXT:   - Length:          '11'
> +; YAML-NEXT:   - Length:          '14'
> ; YAML-NEXT:   - String:          ' instructions '
> ; YAML-NEXT:   - String:          'from '
> ; YAML-NEXT:   - NumOccurrences:  '2'
> 
> diff  --git a/llvm/test/CodeGen/AArch64/machine-outliner.ll b/llvm/test/CodeGen/AArch64/machine-outliner.ll
> index 13a12f76695d..6c76f894c856 100644
> --- a/llvm/test/CodeGen/AArch64/machine-outliner.ll
> +++ b/llvm/test/CodeGen/AArch64/machine-outliner.ll
> @@ -1,5 +1,5 @@
> -; RUN: llc -verify-machineinstrs -enable-machine-outliner -mtriple=aarch64-apple-darwin < %s | FileCheck %s
> -; RUN: llc -verify-machineinstrs -enable-machine-outliner -mtriple=aarch64-apple-darwin -mcpu=cortex-a53 -enable-misched=false < %s | FileCheck %s
> +; RUN: llc -verify-machineinstrs -enable-machine-outliner -aarch64-load-store-renaming=true -mtriple=aarch64-apple-darwin < %s | FileCheck %s
> +; RUN: llc -verify-machineinstrs -enable-machine-outliner -aarch64-load-store-renaming=true -mtriple=aarch64-apple-darwin -mcpu=cortex-a53 -enable-misched=false < %s | FileCheck %s
> ; RUN: llc -verify-machineinstrs -enable-machine-outliner -enable-linkonceodr-outlining -mtriple=aarch64-apple-darwin < %s | FileCheck %s -check-prefix=ODR
> ; RUN: llc -verify-machineinstrs -enable-machine-outliner -mtriple=aarch64-apple-darwin -stop-after=machine-outliner < %s | FileCheck %s -check-prefix=TARGET_FEATURES
> 
> 
> diff  --git a/llvm/test/CodeGen/AArch64/stp-opt-with-renaming-debug.mir b/llvm/test/CodeGen/AArch64/stp-opt-with-renaming-debug.mir
> index 8f573b67c524..17a9c47a588a 100644
> --- a/llvm/test/CodeGen/AArch64/stp-opt-with-renaming-debug.mir
> +++ b/llvm/test/CodeGen/AArch64/stp-opt-with-renaming-debug.mir
> @@ -1,4 +1,4 @@
> -# RUN: llc -run-pass=aarch64-ldst-opt -mtriple=arm64-apple-iphoneos -verify-machineinstrs -o - %s | FileCheck %s
> +# RUN: llc -run-pass=aarch64-ldst-opt -mtriple=arm64-apple-iphoneos -aarch64-load-store-renaming=true -verify-machineinstrs -o - %s | FileCheck %s
> --- |
>   define void @test_dbg_value1() #0 { ret void }
>   define void @test_dbg_value2() #0 { ret void }
> 
> diff  --git a/llvm/test/CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir b/llvm/test/CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir
> index 6a8fae0f7419..38e770eeb404 100644
> --- a/llvm/test/CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir
> +++ b/llvm/test/CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir
> @@ -1,9 +1,9 @@
> -# RUN: llc -run-pass=aarch64-ldst-opt -mattr=+reserve-x10 -mattr=+reserve-x11 \
> -# RUN:     -mattr=+reserve-x15 -mtriple=arm64-apple-iphoneos -verify-machineinstrs \
> +# RUN: llc -run-pass=aarch64-ldst-opt -aarch64-load-store-renaming=true -mattr=+reserve-x10 \
> +# RUN:     -mattr=+reserve-x11 -mattr=+reserve-x15 -mtriple=arm64-apple-iphoneos -verify-machineinstrs \
> # RUN:     -o - %s | FileCheck --check-prefix=CHECK --check-prefix=PRESERVED %s
> 
> -# RUN: llc -run-pass=aarch64-ldst-opt -mtriple=arm64-apple-iphoneos -verify-machineinstrs \
> -# RUN:     -o - %s | FileCheck --check-prefix=CHECK --check-prefix=NOPRES %s
> +# RUN: llc -run-pass=aarch64-ldst-opt -aarch64-load-store-renaming=true -mtriple=arm64-apple-iphoneos \
> +# RUN:     -verify-machineinstrs -o - %s | FileCheck --check-prefix=CHECK --check-prefix=NOPRES %s
> 
> 
> # Make sure we do not pick reserved registers. For test1, we would pick x10,
> 
> diff  --git a/llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir b/llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir
> index b57e32338f07..d9753d71038e 100644
> --- a/llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir
> +++ b/llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir
> @@ -1,5 +1,11 @@
> -# RUN: llc -run-pass=aarch64-ldst-opt -mtriple=arm64-apple-iphoneos -verify-machineinstrs -o - %s | FileCheck %s
> +# RUN: llc -run-pass=aarch64-ldst-opt -mtriple=arm64-apple-iphoneos -verify-machineinstrs -aarch64-load-store-renaming=true -o - %s | FileCheck %s
> +# RUN: llc -run-pass=aarch64-ldst-opt -mtriple=arm64-apple-iphoneos -verify-machineinstrs -o - %s | FileCheck --check-prefix=NO-RENAME %s
> 
> +# NO-RENAME-NOT: STP
> +# NO-RENAME:     test12
> +# NO-RENAME:     STP
> +# NO-RENAME-NOT: STP
> +#
> ---
> # CHECK-LABEL: name: test1
> # CHECK: bb.0:
> 
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits



More information about the llvm-commits mailing list