[PATCH] [X86] Limit maximum nop length on Silvermont
Nadav Rotem
nrotem at apple.com
Thu Jul 3 14:22:54 PDT 2014
LGTM.
> On Jul 3, 2014, at 7:28 AM, Alexey Volkov <avolkov.intel at gmail.com> wrote:
>
> Hi Nadav,
>
> Silvermont can only decode one instruction per cycle if the instruction exceeds 8 bytes.
> Also, on Silvermont, an instruction with more than 3 prefixes incurs a 3-cycle penalty.
> This patch introduces a maximum nop length and limits it to 7 bytes when nops are used for padding on Silvermont.
> For other x86 processors, the maximum nop length remains unchanged at 15 bytes.
>
> http://reviews.llvm.org/D4374
>
> Files:
> lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
> test/MC/X86/x86_long_nop.s
> test/MC/X86/x86_nop.s
>
> Index: lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
> ===================================================================
> --- lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
> +++ lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
> @@ -73,11 +73,12 @@
> };
>
> class X86AsmBackend : public MCAsmBackend {
> - StringRef CPU;
> + const StringRef CPU;
> bool HasNopl;
> + const uint64_t MaxNopLength;
> public:
> X86AsmBackend(const Target &T, StringRef _CPU)
> - : MCAsmBackend(), CPU(_CPU) {
> + : MCAsmBackend(), CPU(_CPU), MaxNopLength(_CPU == "slm" ? 7 : 15) {
> HasNopl = CPU != "generic" && CPU != "i386" && CPU != "i486" &&
> CPU != "i586" && CPU != "pentium" && CPU != "pentium-mmx" &&
> CPU != "i686" && CPU != "k6" && CPU != "k6-2" && CPU != "k6-3" &&
> @@ -331,7 +332,7 @@
> // 15 is the longest single nop instruction. Emit as many 15-byte nops as
> // needed, then emit a nop of the remaining length.
> do {
> - const uint8_t ThisNopLength = (uint8_t) std::min(Count, (uint64_t) 15);
> + const uint8_t ThisNopLength = (uint8_t) std::min(Count, MaxNopLength);
> const uint8_t Prefixes = ThisNopLength <= 10 ? 0 : ThisNopLength - 10;
> for (uint8_t i = 0; i < Prefixes; i++)
> OW->Write8(0x66);
> Index: test/MC/X86/x86_long_nop.s
> ===================================================================
> --- test/MC/X86/x86_long_nop.s
> +++ test/MC/X86/x86_long_nop.s
> @@ -2,6 +2,7 @@
> # RUN: llvm-mc -filetype=obj -arch=x86 -triple=i686-pc-linux-gnu %s | llvm-objdump -d -no-show-raw-insn - | FileCheck %s
> # RUN: llvm-mc -filetype=obj -arch=x86 -triple=x86_64-apple-darwin10.0 %s | llvm-objdump -d -no-show-raw-insn - | FileCheck %s
> # RUN: llvm-mc -filetype=obj -arch=x86 -triple=i686-apple-darwin8 %s | llvm-objdump -d -no-show-raw-insn - | FileCheck %s
> +# RUN: llvm-mc -filetype=obj -arch=x86 -triple=i686-pc-linux-gnu -mcpu=slm %s | llvm-objdump -d -no-show-raw-insn - | FileCheck --check-prefix=SLM %s
>
> # Ensure alignment directives also emit sequences of 15-byte NOPs on processors
> # capable of using long NOPs.
> @@ -13,3 +14,12 @@
> # CHECK-NEXT: 10: nop
> # CHECK-NEXT: 1f: nop
> # CHECK-NEXT: 20: inc
> +
> +# On Silvermont we emit only 7 byte NOPs since longer NOPs are not profitable
> +# SLM: 0: inc
> +# SLM-NEXT: 1: nop
> +# SLM-NEXT: 8: nop
> +# SLM-NEXT: f: nop
> +# SLM-NEXT: 16: nop
> +# SLM-NEXT: 1d: nop
> +# SLM-NEXT: 20: inc
> Index: test/MC/X86/x86_nop.s
> ===================================================================
> --- test/MC/X86/x86_nop.s
> +++ test/MC/X86/x86_nop.s
> @@ -14,6 +14,7 @@
> # RUN: llvm-mc -filetype=obj -triple=i686-pc-linux -mcpu=c3 %s | llvm-objdump -d - | FileCheck %s
> # RUN: llvm-mc -filetype=obj -triple=i686-pc-linux -mcpu=c3-2 %s | llvm-objdump -d - | FileCheck %s
> # RUN: llvm-mc -filetype=obj -triple=i686-pc-linux -mcpu=core2 %s | llvm-objdump -d - | FileCheck --check-prefix=NOPL %s
> +# RUN: llvm-mc -filetype=obj -triple=i686-pc-linux -mcpu=slm %s | llvm-objdump -d - | FileCheck --check-prefix=NOPL %s
>
>
> inc %eax
> <D4374.11054.patch>
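
For reference, the padding arithmetic behind the new SLM check lines can be sketched as a standalone program fragment. This is only an illustration of the std::min loop in the patch, not the LLVM code itself: splitNops and prefixCount are hypothetical helper names that do not exist in X86AsmBackend.cpp, and the sketch returns nop lengths instead of writing bytes through the object writer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of LLVM): split Count padding bytes into
// individual nop lengths, each at most MaxNopLength bytes, in the same
// order the loop in the patch would emit them.
std::vector<uint8_t> splitNops(uint64_t Count, uint64_t MaxNopLength) {
  // Assumption: the cap is between 1 and 15, the two values used by the
  // patch (7 for "slm", 15 otherwise) both satisfy this.
  assert(MaxNopLength >= 1 && MaxNopLength <= 15);
  std::vector<uint8_t> Lengths;
  while (Count != 0) {
    const uint8_t ThisNopLength =
        static_cast<uint8_t>(std::min(Count, MaxNopLength));
    Lengths.push_back(ThisNopLength);
    Count -= ThisNopLength;
  }
  return Lengths;
}

// Hypothetical helper: number of 0x66 prefixes written before a nop of the
// given length, following the expression shown in the diff context.
uint8_t prefixCount(uint8_t ThisNopLength) {
  return ThisNopLength <= 10 ? 0 : static_cast<uint8_t>(ThisNopLength - 10);
}
```

In x86_long_nop.s the padding runs from offset 0x1 to 0x20, i.e. 31 bytes. With a cap of 7, splitNops(31, 7) gives 7, 7, 7, 7, 3, which places nops at 0x1, 0x8, 0xf, 0x16 and 0x1d as in the SLM lines. With a cap of 15 it gives 15, 15, 1, matching the existing CHECK lines at 0x10 and 0x1f. A 7-byte nop also needs no 0x66 prefixes, since prefixCount is 0 for any length up to 10.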