[PATCH] D36749: [LLD][ELF][AArch64] Complete implementation of -fix-cortex-a53-843419
Rafael Avila de Espindola via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 14 20:07:47 PST 2017
LGTM
Peter Smith via Phabricator <reviews at reviews.llvm.org> writes:
> peter.smith updated this revision to Diff 126368.
> peter.smith added a comment.
>
> Comments from Rafael
>
>> These comment updates are fine. Please commit them first.
>
> Ok, committed in r320372
>
>>> +class SectionPatcher {
>>
>> Maybe call it AArch843419Patcher?
>
> I've gone for AArch64ErrPatcher.
>
>>> + // Apply any relocation transferred from the original PatcheeSection.
>>> + // For a SyntheticSection Buf already has OutSecOff added, but relocateAlloc
>>> + // also adds OutSecOff so we need to subtract to avoid double counting.
>>> + this->relocateAlloc(Buf - OutSecOff, Buf - OutSecOff + getSize());
>>
>> I wonder if we could read the already patched output buffer and avoid
>> this? Can we guarantee that the patch is always written after the patchee?
>
> The patch will always be later in the list of InputSections than the patchee, but if the writing of InputSections is multithreaded then I don't think we could guarantee that. In any case, the current implementation uses a branch relocation at the location of the original patchee instruction to mutate it into a branch. I think that if we were to go down that route we'd need to do something like Gold, which does 2 passes over the relocations: one for non-patch sections and one for patch sections, with the patch section pass responsible for mutating the patchee instruction into a branch after copying it into the patch.
>
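
The double-counting issue in the relocateAlloc comment above can be shown with a toy model. This is only a sketch: ToySection, relocate, writeTo and patchedIndex are hypothetical names, not LLD's classes, and the only behaviour assumed is what the quoted comment states (the buffer pointer already includes OutSecOff, and the relocation routine adds OutSecOff again).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a section whose relocation routine adds the
// section's output offset to the buffer pointer it is given.
struct ToySection {
  std::size_t OutSecOff; // offset of the section within its output section
  std::size_t RelocOff;  // offset of the relocated byte within the section

  // Models the behaviour described for relocateAlloc: it adds OutSecOff
  // itself before writing the relocated byte.
  void relocate(std::uint8_t *Buf) const { Buf[OutSecOff + RelocOff] = 0xAA; }

  // Buf already points at this section's data (OutSecOff already added), so
  // subtract OutSecOff before delegating to avoid counting it twice.
  void writeTo(std::uint8_t *Buf) const { relocate(Buf - OutSecOff); }
};

// Returns the index in a 64-byte output buffer that the relocation touched.
std::size_t patchedIndex(std::size_t OutSecOff, std::size_t RelocOff) {
  std::vector<std::uint8_t> Out(64, 0);
  ToySection S{OutSecOff, RelocOff};
  S.writeTo(Out.data() + OutSecOff);
  for (std::size_t I = 0; I < Out.size(); ++I)
    if (Out[I] == 0xAA)
      return I;
  return Out.size();
}
```

Without the subtraction in writeTo, the write in this model would land at 2 * OutSecOff + RelocOff rather than OutSecOff + RelocOff.
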
>>> + std::merge(ISD.Sections.begin(), ISD.Sections.end(), Patches.begin(),
>>> + Patches.end(), std::back_inserter(Tmp), MergeCmp);
>>
>> After this the patch sections still have a OutSecOff of the limit of
>> where they can be placed, no? Is that OK?
>
> Yes, as we only merge into the InputSectionDescription->Sections once, using OutSecOff in the comparison function for the merge. At the end of the pass assignAddresses() will give each patch the correct OutSecOff.
>
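
A minimal sketch of the merge step discussed above, assuming both lists are already sorted by OutSecOff. ToySec, mergeByOffset and the tie-break rule (a patch sorts before an ordinary section at the same offset) are illustrative assumptions; the actual MergeCmp is not shown in this message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for an input section or a patch section.
struct ToySec {
  std::string Name;
  std::uint64_t OutSecOff; // for a patch: the placement limit, not final
  bool IsPatch;
};

// Merges two lists that are each sorted by OutSecOff and returns the
// resulting section names in order.
std::vector<std::string> mergeByOffset(const std::vector<ToySec> &Sections,
                                       const std::vector<ToySec> &Patches) {
  // Assumed comparison: order by OutSecOff; on a tie, the patch goes first.
  auto Cmp = [](const ToySec &A, const ToySec &B) {
    if (A.OutSecOff != B.OutSecOff)
      return A.OutSecOff < B.OutSecOff;
    return A.IsPatch && !B.IsPatch;
  };
  std::vector<ToySec> Tmp;
  std::merge(Sections.begin(), Sections.end(), Patches.begin(), Patches.end(),
             std::back_inserter(Tmp), Cmp);
  std::vector<std::string> Names;
  for (const ToySec &S : Tmp)
    Names.push_back(S.Name);
  return Names;
}
```

In this model the patch keeps its placeholder OutSecOff after the merge. That is harmless here because the offset is only consulted once, for ordering; a later address-assignment pass would overwrite it.
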
>>> +void SectionPatcher::patchInputSectionDescription(
>>> + InputSectionDescription &ISD, std::vector<Patch843419Section *> &Patches) {
>>
>> return the std::vector.
>
> Ok, done.
>
>
> https://reviews.llvm.org/D36749
>
> Files:
> ELF/AArch64ErrataFix.cpp
> ELF/AArch64ErrataFix.h
> ELF/Writer.cpp
> test/ELF/aarch64-cortex-a53-843419-address.s
> test/ELF/aarch64-cortex-a53-843419-large.s
> test/ELF/aarch64-cortex-a53-843419-recognize.s
> test/ELF/aarch64-cortex-a53-843419-thunk.s
>
> Index: test/ELF/aarch64-cortex-a53-843419-thunk.s
> ===================================================================
> --- test/ELF/aarch64-cortex-a53-843419-thunk.s
> +++ test/ELF/aarch64-cortex-a53-843419-thunk.s
> @@ -4,6 +4,7 @@
> // RUN: .text1 0x10000 : { *(.text.01) *(.text.02) *(.text.03) } \
> // RUN: .text2 0x100000000 : { *(.text.04) } } " > %t.script
> // RUN: ld.lld --script %t.script -fix-cortex-a53-843419 -verbose %t.o -o %t2 | FileCheck -check-prefix=CHECK-PRINT %s
> +// RUN: llvm-objdump -triple=aarch64-linux-gnu -d %t2 | FileCheck %s
>
> // Test cases for Cortex-A53 Erratum 843419 that involve interactions with
> // range extension thunks. Both erratum fixes and range extension thunks need
> @@ -33,6 +34,15 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 10FFC in unpatched output.
> +// CHECK: t3_ff8_ldr:
> +// CHECK-NEXT: 10ffc: 80 ff 7f 90 adrp x0, #4294901760
> +// CHECK-NEXT: 11000: 21 00 40 f9 ldr x1, [x1]
> +// CHECK-NEXT: 11004: 02 00 00 14 b #8
> +// CHECK-NEXT: 11008: c0 03 5f d6 ret
> +// CHECK: __CortexA53843419_11004:
> +// CHECK-NEXT: 1100c: 00 08 40 f9 ldr x0, [x0, #16]
> +// CHECK-NEXT: 11010: fe ff ff 17 b #-8
> +
> .section .text.04, "ax", %progbits
> .globl far_away
> .type far_away, function
> Index: test/ELF/aarch64-cortex-a53-843419-recognize.s
> ===================================================================
> --- test/ELF/aarch64-cortex-a53-843419-recognize.s
> +++ test/ELF/aarch64-cortex-a53-843419-recognize.s
> @@ -1,30 +1,38 @@
> // REQUIRES: aarch64
> // RUN: llvm-mc -filetype=obj -triple=aarch64-none-linux %s -o %t.o
> // RUN: ld.lld -fix-cortex-a53-843419 -verbose %t.o -o %t2 | FileCheck -check-prefix CHECK-PRINT %s
> -
> +// RUN: llvm-objdump -triple=aarch64-linux-gnu -d %t2 | FileCheck %s -check-prefixes=CHECK,CHECK-FIX
> +// RUN: ld.lld -verbose %t.o -o %t3
> +// RUN: llvm-objdump -triple=aarch64-linux-gnu -d %t3 | FileCheck %s -check-prefixes=CHECK,CHECK-NOFIX
> // Test cases for Cortex-A53 Erratum 843419
> // See ARM-EPM-048406 Cortex_A53_MPCore_Software_Developers_Errata_Notice.pdf
> // for full erratum details.
> // In Summary
> // 1.)
> -// ADRP (0xff8 or 0xffc)
> +// ADRP (0xff8 or 0xffc).
> // 2.)
> -// - load or store single register or either integer or vector registers
> -// - STP or STNP of either vector or vector registers
> -// - Advanced SIMD ST1 store instruction
> -// Must not write Rn
> -// 3.) optional instruction, can't be a branch, must not write Rn, may read Rn
> +// - load or store single register of either integer or vector registers.
> +// - STP or STNP of either integer or vector registers.
> +// - Advanced SIMD ST1 store instruction.
> +// - Must not write Rn.
> +// 3.) optional instruction, can't be a branch, must not write Rn, may read Rn.
> // 4.) A load or store instruction from the Load/Store register unsigned
> -// immediate class using Rn as the base register
> +// immediate class using Rn as the base register.
>
> // Each section contains a sequence of instructions that should be recognized
> // as erratum 843419. The test cases cover the major variations such as:
> -// adrp starts at 0xfff8 or 0xfffc
> -// Variations in instruction class for instruction 2
> -// Optional instruction 3 present or not
> -// Load or store for instruction 4.
> +// - adrp starts at 0xff8 or 0xffc.
> +// - Variations in instruction class for instruction 2.
> +// - Optional instruction 3 present or not.
> +// - Load or store for instruction 4.
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 21FF8 in unpatched output.
> +// CHECK: t3_ff8_ldr:
> +// CHECK-NEXT: 21ff8: e0 01 00 f0 adrp x0, #258048
> +// CHECK-NEXT: 21ffc: 21 00 40 f9 ldr x1, [x1]
> +// CHECK-FIX: 22000: 03 b8 00 14 b #188428
> +// CHECK-NOFIX: 22000: 00 00 40 f9 ldr x0, [x0]
> +// CHECK-NEXT: 22004: c0 03 5f d6 ret
> .section .text.01, "ax", %progbits
> .balign 4096
> .globl t3_ff8_ldr
> @@ -37,6 +45,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 23FF8 in unpatched output.
> +// CHECK: t3_ff8_ldrsimd:
> +// CHECK-NEXT: 23ff8: e0 01 00 b0 adrp x0, #249856
> +// CHECK-NEXT: 23ffc: 21 00 40 bd ldr s1, [x1]
> +// CHECK-FIX: 24000: 05 b0 00 14 b #180244
> +// CHECK-NOFIX: 24000: 02 04 40 f9 ldr x2, [x0, #8]
> +// CHECK-NEXT: 24004: c0 03 5f d6 ret
> .section .text.02, "ax", %progbits
> .balign 4096
> .globl t3_ff8_ldrsimd
> @@ -49,6 +63,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 25FFC in unpatched output.
> +// CHECK: t3_ffc_ldrpost:
> +// CHECK-NEXT: 25ffc: c0 01 00 f0 adrp x0, #241664
> +// CHECK-NEXT: 26000: 21 84 40 bc ldr s1, [x1], #8
> +// CHECK-FIX: 26004: 06 a8 00 14 b #172056
> +// CHECK-NOFIX: 26004: 03 08 40 f9 ldr x3, [x0, #16]
> +// CHECK-NEXT: 26008: c0 03 5f d6 ret
> .section .text.03, "ax", %progbits
> .balign 4096
> .globl t3_ffc_ldrpost
> @@ -61,6 +81,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 27FF8 in unpatched output.
> +// CHECK: t3_ff8_strpre:
> +// CHECK-NEXT: 27ff8: c0 01 00 b0 adrp x0, #233472
> +// CHECK-NEXT: 27ffc: 21 8c 00 bc str s1, [x1, #8]!
> +// CHECK-FIX: 28000: 09 a0 00 14 b #163876
> +// CHECK-NOFIX: 28000: 02 00 40 f9 ldr x2, [x0]
> +// CHECK-NEXT: 28004: c0 03 5f d6 ret
> .section .text.04, "ax", %progbits
> .balign 4096
> .globl t3_ff8_strpre
> @@ -73,6 +99,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 29FFC in unpatched output.
> +// CHECK: t3_ffc_str:
> +// CHECK-NEXT: 29ffc: bc 01 00 f0 adrp x28, #225280
> +// CHECK-NEXT: 2a000: 42 00 00 f9 str x2, [x2]
> +// CHECK-FIX: 2a004: 0a 98 00 14 b #155688
> +// CHECK-NOFIX: 2a004: 9c 07 00 f9 str x28, [x28, #8]
> +// CHECK-NEXT: 2a008: c0 03 5f d6 ret
> .section .text.05, "ax", %progbits
> .balign 4096
> .globl t3_ffc_str
> @@ -85,6 +117,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 2BFFC in unpatched output.
> +// CHECK: t3_ffc_strsimd:
> +// CHECK-NEXT: 2bffc: bc 01 00 b0 adrp x28, #217088
> +// CHECK-NEXT: 2c000: 44 00 00 b9 str w4, [x2]
> +// CHECK-FIX: 2c004: 0c 90 00 14 b #147504
> +// CHECK-NOFIX: 2c004: 84 0b 00 f9 str x4, [x28, #16]
> +// CHECK-NEXT: 2c008: c0 03 5f d6 ret
> .section .text.06, "ax", %progbits
> .balign 4096
> .globl t3_ffc_strsimd
> @@ -97,6 +135,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 2DFF8 in unpatched output.
> +// CHECK: t3_ff8_ldrunpriv:
> +// CHECK-NEXT: 2dff8: 9d 01 00 f0 adrp x29, #208896
> +// CHECK-NEXT: 2dffc: 41 08 40 38 ldtrb w1, [x2]
> +// CHECK-FIX: 2e000: 0f 88 00 14 b #139324
> +// CHECK-NOFIX: 2e000: bd 03 40 f9 ldr x29, [x29]
> +// CHECK-NEXT: 2e004: c0 03 5f d6 ret
> .section .text.07, "ax", %progbits
> .balign 4096
> .globl t3_ff8_ldrunpriv
> @@ -109,7 +153,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 2FFFC in unpatched output.
> - .section .text.08, "ax", %progbits
> +// CHECK: t3_ffc_ldur:
> +// CHECK-NEXT: 2fffc: 9d 01 00 b0 adrp x29, #200704
> +// CHECK-NEXT: 30000: 42 40 40 b8 ldur w2, [x2, #4]
> +// CHECK-FIX: 30004: 10 80 00 14 b #131136
> +// CHECK-NOFIX: 30004: bd 07 40 f9 ldr x29, [x29, #8]
> +// CHECK-NEXT: 30008: c0 03 5f d6 ret
> .balign 4096
> .globl t3_ffc_ldur
> .type t3_ffc_ldur, %function
> @@ -121,6 +170,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 31FFC in unpatched output.
> +// CHECK: t3_ffc_sturh:
> +// CHECK-NEXT: 31ffc: 72 01 00 f0 adrp x18, #192512
> +// CHECK-NEXT: 32000: 43 40 00 78 sturh w3, [x2, #4]
> +// CHECK-FIX: 32004: 12 78 00 14 b #122952
> +// CHECK-NOFIX: 32004: 41 0a 40 f9 ldr x1, [x18, #16]
> +// CHECK-NEXT: 32008: c0 03 5f d6 ret
> .section .text.09, "ax", %progbits
> .balign 4096
> .globl t3_ffc_sturh
> @@ -133,6 +188,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 33FF8 in unpatched output.
> +// CHECK: t3_ff8_literal:
> +// CHECK-NEXT: 33ff8: 72 01 00 b0 adrp x18, #184320
> +// CHECK-NEXT: 33ffc: e3 ff ff 58 ldr x3, #-4
> +// CHECK-FIX: 34000: 15 70 00 14 b #114772
> +// CHECK-NOFIX: 34000: 52 02 40 f9 ldr x18, [x18]
> +// CHECK-NEXT: 34004: c0 03 5f d6 ret
> .section .text.10, "ax", %progbits
> .balign 4096
> .globl t3_ff8_literal
> @@ -145,6 +206,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 35FFC in unpatched output.
> +// CHECK: t3_ffc_register:
> +// CHECK-NEXT: 35ffc: 4f 01 00 f0 adrp x15, #176128
> +// CHECK-NEXT: 36000: 43 68 61 f8 ldr x3, [x2, x1]
> +// CHECK-FIX: 36004: 16 68 00 14 b #106584
> +// CHECK-NOFIX: 36004: ea 05 40 f9 ldr x10, [x15, #8]
> +// CHECK-NEXT: 36008: c0 03 5f d6 ret
> .section .text.11, "ax", %progbits
> .balign 4096
> .globl t3_ffc_register
> @@ -157,6 +224,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 37FF8 in unpatched output.
> +// CHECK: t3_ff8_stp:
> +// CHECK-NEXT: 37ff8: 50 01 00 b0 adrp x16, #167936
> +// CHECK-NEXT: 37ffc: 61 08 00 a9 stp x1, x2, [x3]
> +// CHECK-FIX: 38000: 19 60 00 14 b #98404
> +// CHECK-NOFIX: 38000: 0d 0a 40 f9 ldr x13, [x16, #16]
> +// CHECK-NEXT: 38004: c0 03 5f d6 ret
> .section .text.12, "ax", %progbits
> .balign 4096
> .globl t3_ff8_stp
> @@ -169,6 +242,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 39FFC in unpatched output.
> +// CHECK: t3_ffc_stnp:
> +// CHECK-NEXT: 39ffc: 27 01 00 f0 adrp x7, #159744
> +// CHECK-NEXT: 3a000: 61 08 00 a8 stnp x1, x2, [x3]
> +// CHECK-FIX: 3a004: 1a 58 00 14 b #90216
> +// CHECK-NOFIX: 3a004: e9 00 40 f9 ldr x9, [x7]
> +// CHECK-NEXT: 3a008: c0 03 5f d6 ret
> .section .text.13, "ax", %progbits
> .balign 4096
> .globl t3_ffc_stnp
> @@ -181,6 +260,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 3BFFC in unpatched output.
> +// CHECK: t3_ffc_st1singlepost:
> +// CHECK-NEXT: 3bffc: 37 01 00 b0 adrp x23, #151552
> +// CHECK-NEXT: 3c000: 20 70 82 4c st1 { v0.16b }, [x1], x2
> +// CHECK-FIX: 3c004: 1c 50 00 14 b #82032
> +// CHECK-NOFIX: 3c004: f6 06 40 f9 ldr x22, [x23, #8]
> +// CHECK-NEXT: 3c008: c0 03 5f d6 ret
> .section .text.14, "ax", %progbits
> .balign 4096
> .globl t3_ffc_st1singlepost
> @@ -193,6 +278,12 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 3DFF8 in unpatched output.
> +// CHECK: t3_ff8_st1multiple:
> +// CHECK-NEXT: 3dff8: 17 01 00 f0 adrp x23, #143360
> +// CHECK-NEXT: 3dffc: 20 a0 00 4c st1 { v0.16b, v1.16b }, [x1]
> +// CHECK-FIX: 3e000: 1f 48 00 14 b #73852
> +// CHECK-NOFIX: 3e000: f8 0a 40 f9 ldr x24, [x23, #16]
> +// CHECK-NEXT: 3e004: c0 03 5f d6 ret
> .section .text.15, "ax", %progbits
> .balign 4096
> .globl t3_ff8_st1multiple
> @@ -205,6 +296,13 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 3FFF8 in unpatched output.
> +// CHECK: t4_ff8_ldr:
> +// CHECK-NEXT: 3fff8: 00 01 00 b0 adrp x0, #135168
> +// CHECK-NEXT: 3fffc: 21 00 40 f9 ldr x1, [x1]
> +// CHECK-NEXT: 40000: 42 00 00 8b add x2, x2, x0
> +// CHECK-FIX: 40004: 20 40 00 14 b #65664
> +// CHECK-NOFIX: 40004: 02 00 40 f9 ldr x2, [x0]
> +// CHECK-NEXT: 40008: c0 03 5f d6 ret
> .section .text.16, "ax", %progbits
> .balign 4096
> .globl t4_ff8_ldr
> @@ -218,6 +316,13 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 41FFC in unpatched output.
> +// CHECK: t4_ffc_str:
> +// CHECK-NEXT: 41ffc: fc 00 00 f0 adrp x28, #126976
> +// CHECK-NEXT: 42000: 42 00 00 f9 str x2, [x2]
> +// CHECK-NEXT: 42004: 20 00 02 cb sub x0, x1, x2
> +// CHECK-FIX: 42008: 21 38 00 14 b #57476
> +// CHECK-NOFIX: 42008: 9b 07 00 f9 str x27, [x28, #8]
> +// CHECK-NEXT: 4200c: c0 03 5f d6 ret
> .section .text.17, "ax", %progbits
> .balign 4096
> .globl t4_ffc_str
> @@ -231,6 +336,13 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 43FF8 in unpatched output.
> +// CHECK: t4_ff8_stp:
> +// CHECK-NEXT: 43ff8: f0 00 00 b0 adrp x16, #118784
> +// CHECK-NEXT: 43ffc: 61 08 00 a9 stp x1, x2, [x3]
> +// CHECK-NEXT: 44000: 03 7e 10 9b mul x3, x16, x16
> +// CHECK-FIX: 44004: 24 30 00 14 b #49296
> +// CHECK-NOFIX: 44004: 0e 0a 40 f9 ldr x14, [x16, #16]
> +// CHECK-NEXT: 44008: c0 03 5f d6 ret
> .section .text.18, "ax", %progbits
> .balign 4096
> .globl t4_ff8_stp
> @@ -244,6 +356,13 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 45FF8 in unpatched output.
> +// CHECK: t4_ff8_stppre:
> +// CHECK-NEXT: 45ff8: d0 00 00 f0 adrp x16, #110592
> +// CHECK-NEXT: 45ffc: 61 08 81 a9 stp x1, x2, [x3, #16]!
> +// CHECK-NEXT: 46000: 03 7e 10 9b mul x3, x16, x16
> +// CHECK-FIX: 46004: 26 28 00 14 b #41112
> +// CHECK-NOFIX: 46004: 0e 06 40 f9 ldr x14, [x16, #8]
> +// CHECK-NEXT: 46008: c0 03 5f d6 ret
> .section .text.19, "ax", %progbits
> .balign 4096
> .globl t4_ff8_stppre
> @@ -257,6 +376,13 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 47FF8 in unpatched output.
> +// CHECK: t4_ff8_stppost:
> +// CHECK-NEXT: 47ff8: d0 00 00 b0 adrp x16, #102400
> +// CHECK-NEXT: 47ffc: 61 08 81 a8 stp x1, x2, [x3], #16
> +// CHECK-NEXT: 48000: 03 7e 10 9b mul x3, x16, x16
> +// CHECK-FIX: 48004: 28 20 00 14 b #32928
> +// CHECK-NOFIX: 48004: 0e 06 40 f9 ldr x14, [x16, #8]
> +// CHECK-NEXT: 48008: c0 03 5f d6 ret
> .section .text.20, "ax", %progbits
> .balign 4096
> .globl t4_ff8_stppost
> @@ -270,6 +396,13 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 49FFC in unpatched output.
> +// CHECK: t4_ffc_stpsimd:
> +// CHECK-NEXT: 49ffc: b0 00 00 f0 adrp x16, #94208
> +// CHECK-NEXT: 4a000: 61 08 00 ad stp q1, q2, [x3]
> +// CHECK-NEXT: 4a004: 03 7e 10 9b mul x3, x16, x16
> +// CHECK-FIX: 4a008: 29 18 00 14 b #24740
> +// CHECK-NOFIX: 4a008: 0e 06 40 f9 ldr x14, [x16, #8]
> +// CHECK-NEXT: 4a00c: c0 03 5f d6 ret
> .section .text.21, "ax", %progbits
> .balign 4096
> .globl t4_ffc_stpsimd
> @@ -283,6 +416,13 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 4BFFC in unpatched output.
> +// CHECK: t4_ffc_stnp:
> +// CHECK-NEXT: 4bffc: a7 00 00 b0 adrp x7, #86016
> +// CHECK-NEXT: 4c000: 61 08 00 a8 stnp x1, x2, [x3]
> +// CHECK-NEXT: 4c004: 1f 20 03 d5 nop
> +// CHECK-FIX: 4c008: 2b 10 00 14 b #16556
> +// CHECK-NOFIX: 4c008: ea 00 40 f9 ldr x10, [x7]
> +// CHECK-NEXT: 4c00c: c0 03 5f d6 ret
> .section .text.22, "ax", %progbits
> .balign 4096
> .globl t4_ffc_stnp
> @@ -296,6 +436,13 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 4DFFC in unpatched output.
> +// CHECK: t4_ffc_st1:
> +// CHECK-NEXT: 4dffc: 98 00 00 f0 adrp x24, #77824
> +// CHECK-NEXT: 4e000: 20 70 00 4c st1 { v0.16b }, [x1]
> +// CHECK-NEXT: 4e004: f6 06 40 f9 ldr x22, [x23, #8]
> +// CHECK-FIX: 4e008: 2d 08 00 14 b #8372
> +// CHECK-NOFIX: 4e008: 18 ff 3f f9 str x24, [x24, #32760]
> +// CHECK-NEXT: 4e00c: c0 03 5f d6 ret
> .section .text.23, "ax", %progbits
> .balign 4096
> .globl t4_ffc_st1
> @@ -309,6 +456,13 @@
> ret
>
> // CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 4FFF8 in unpatched output.
> +// CHECK: t3_ff8_ldr_once:
> +// CHECK-NEXT: 4fff8: 80 00 00 b0 adrp x0, #69632
> +// CHECK-NEXT: 4fffc: 20 70 82 4c st1 { v0.16b }, [x1], x2
> +// CHECK-FIX: 50000: 31 00 00 14 b #196
> +// CHECK-NOFIX: 50000: 01 08 40 f9 ldr x1, [x0, #16]
> +// CHECK-NEXT: 50004: 02 08 40 f9 ldr x2, [x0, #16]
> +// CHECK-NEXT: 50008: c0 03 5f d6 ret
> .section .text.24, "ax", %progbits
> .balign 4096
> .globl t3_ff8_ldr_once
> @@ -327,6 +481,79 @@
> _start:
> ret
>
> +// CHECK-FIX: __CortexA53843419_22000:
> +// CHECK-FIX-NEXT: 5000c: 00 00 40 f9 ldr x0, [x0]
> +// CHECK-FIX-NEXT: 50010: fd 47 ff 17 b #-188428
> +// CHECK-FIX: __CortexA53843419_24000:
> +// CHECK-FIX-NEXT: 50014: 02 04 40 f9 ldr x2, [x0, #8]
> +// CHECK-FIX-NEXT: 50018: fb 4f ff 17 b #-180244
> +// CHECK-FIX: __CortexA53843419_26004:
> +// CHECK-FIX-NEXT: 5001c: 03 08 40 f9 ldr x3, [x0, #16]
> +// CHECK-FIX-NEXT: 50020: fa 57 ff 17 b #-172056
> +// CHECK-FIX: __CortexA53843419_28000:
> +// CHECK-FIX-NEXT: 50024: 02 00 40 f9 ldr x2, [x0]
> +// CHECK-FIX-NEXT: 50028: f7 5f ff 17 b #-163876
> +// CHECK-FIX: __CortexA53843419_2A004:
> +// CHECK-FIX-NEXT: 5002c: 9c 07 00 f9 str x28, [x28, #8]
> +// CHECK-FIX-NEXT: 50030: f6 67 ff 17 b #-155688
> +// CHECK-FIX: __CortexA53843419_2C004:
> +// CHECK-FIX-NEXT: 50034: 84 0b 00 f9 str x4, [x28, #16]
> +// CHECK-FIX-NEXT: 50038: f4 6f ff 17 b #-147504
> +// CHECK-FIX: __CortexA53843419_2E000:
> +// CHECK-FIX-NEXT: 5003c: bd 03 40 f9 ldr x29, [x29]
> +// CHECK-FIX-NEXT: 50040: f1 77 ff 17 b #-139324
> +// CHECK-FIX: __CortexA53843419_30004:
> +// CHECK-FIX-NEXT: 50044: bd 07 40 f9 ldr x29, [x29, #8]
> +// CHECK-FIX-NEXT: 50048: f0 7f ff 17 b #-131136
> +// CHECK-FIX: __CortexA53843419_32004:
> +// CHECK-FIX-NEXT: 5004c: 41 0a 40 f9 ldr x1, [x18, #16]
> +// CHECK-FIX-NEXT: 50050: ee 87 ff 17 b #-122952
> +// CHECK-FIX: __CortexA53843419_34000:
> +// CHECK-FIX-NEXT: 50054: 52 02 40 f9 ldr x18, [x18]
> +// CHECK-FIX-NEXT: 50058: eb 8f ff 17 b #-114772
> +// CHECK-FIX: __CortexA53843419_36004:
> +// CHECK-FIX-NEXT: 5005c: ea 05 40 f9 ldr x10, [x15, #8]
> +// CHECK-FIX-NEXT: 50060: ea 97 ff 17 b #-106584
> +// CHECK-FIX: __CortexA53843419_38000:
> +// CHECK-FIX-NEXT: 50064: 0d 0a 40 f9 ldr x13, [x16, #16]
> +// CHECK-FIX-NEXT: 50068: e7 9f ff 17 b #-98404
> +// CHECK-FIX: __CortexA53843419_3A004:
> +// CHECK-FIX-NEXT: 5006c: e9 00 40 f9 ldr x9, [x7]
> +// CHECK-FIX-NEXT: 50070: e6 a7 ff 17 b #-90216
> +// CHECK-FIX: __CortexA53843419_3C004:
> +// CHECK-FIX-NEXT: 50074: f6 06 40 f9 ldr x22, [x23, #8]
> +// CHECK-FIX-NEXT: 50078: e4 af ff 17 b #-82032
> +// CHECK-FIX: __CortexA53843419_3E000:
> +// CHECK-FIX-NEXT: 5007c: f8 0a 40 f9 ldr x24, [x23, #16]
> +// CHECK-FIX-NEXT: 50080: e1 b7 ff 17 b #-73852
> +// CHECK-FIX: __CortexA53843419_40004:
> +// CHECK-FIX-NEXT: 50084: 02 00 40 f9 ldr x2, [x0]
> +// CHECK-FIX-NEXT: 50088: e0 bf ff 17 b #-65664
> +// CHECK-FIX: __CortexA53843419_42008:
> +// CHECK-FIX-NEXT: 5008c: 9b 07 00 f9 str x27, [x28, #8]
> +// CHECK-FIX-NEXT: 50090: df c7 ff 17 b #-57476
> +// CHECK-FIX: __CortexA53843419_44004:
> +// CHECK-FIX-NEXT: 50094: 0e 0a 40 f9 ldr x14, [x16, #16]
> +// CHECK-FIX-NEXT: 50098: dc cf ff 17 b #-49296
> +// CHECK-FIX: __CortexA53843419_46004:
> +// CHECK-FIX-NEXT: 5009c: 0e 06 40 f9 ldr x14, [x16, #8]
> +// CHECK-FIX-NEXT: 500a0: da d7 ff 17 b #-41112
> +// CHECK-FIX: __CortexA53843419_48004:
> +// CHECK-FIX-NEXT: 500a4: 0e 06 40 f9 ldr x14, [x16, #8]
> +// CHECK-FIX-NEXT: 500a8: d8 df ff 17 b #-32928
> +// CHECK-FIX: __CortexA53843419_4A008:
> +// CHECK-FIX-NEXT: 500ac: 0e 06 40 f9 ldr x14, [x16, #8]
> +// CHECK-FIX-NEXT: 500b0: d7 e7 ff 17 b #-24740
> +// CHECK-FIX: __CortexA53843419_4C008:
> +// CHECK-FIX-NEXT: 500b4: ea 00 40 f9 ldr x10, [x7]
> +// CHECK-FIX-NEXT: 500b8: d5 ef ff 17 b #-16556
> +// CHECK-FIX: __CortexA53843419_4E008:
> +// CHECK-FIX-NEXT: 500bc: 18 ff 3f f9 str x24, [x24, #32760]
> +// CHECK-FIX-NEXT: 500c0: d3 f7 ff 17 b #-8372
> +// CHECK-FIX: __CortexA53843419_50000:
> +// CHECK-FIX-NEXT: 500c4: 01 08 40 f9 ldr x1, [x0, #16]
> +// CHECK-FIX-NEXT: 500c8: cf ff ff 17 b #-196
> +
> .data
> .globl dat
> .globl dat2
> Index: test/ELF/aarch64-cortex-a53-843419-large.s
> ===================================================================
> --- /dev/null
> +++ test/ELF/aarch64-cortex-a53-843419-large.s
> @@ -0,0 +1,115 @@
> +// REQUIRES: aarch64
> +// RUN: llvm-mc -filetype=obj -triple=aarch64-none-linux %s -o %t.o
> +// RUN: ld.lld --fix-cortex-a53-843419 %t.o -o %t2
> +// RUN: llvm-objdump -triple=aarch64-linux-gnu -d %t2 -start-address=131072 -stop-address=131084 | FileCheck --check-prefix=CHECK1 %s
> +// RUN: llvm-objdump -triple=aarch64-linux-gnu -d %t2 -start-address=135168 -stop-address=135172 | FileCheck --check-prefix=CHECK2 %s
> +// RUN: llvm-objdump -triple=aarch64-linux-gnu -d %t2 -start-address=139256 -stop-address=139272 | FileCheck --check-prefix=CHECK3 %s
> +// RUN: llvm-objdump -triple=aarch64-linux-gnu -d %t2 -start-address=67256312 -stop-address=67256328 | FileCheck --check-prefix=CHECK4 %s
> +// RUN: llvm-objdump -triple=aarch64-linux-gnu -d %t2 -start-address=100810760 -stop-address=100810776 | FileCheck --check-prefix=CHECK5 %s
> +// RUN: llvm-objdump -triple=aarch64-linux-gnu -d %t2 -start-address=134352908 -stop-address=134352912 | FileCheck --check-prefix=CHECK6 %s
> +// RUN: llvm-objdump -triple=aarch64-linux-gnu -d %t2 -start-address=134356988 -stop-address=134357012 | FileCheck --check-prefix=CHECK7 %s
> +// Test case for Cortex-A53 Erratum 843419 in an OutputSection exceeding
> +// the maximum branch range. Both range extension thunks and patches are
> +// required.
> +
> +// CHECK1: __AArch64AbsLongThunk_need_thunk_after_patch:
> +// CHECK1-NEXT: 20000: 50 00 00 58 ldr x16, #8
> +// CHECK1-NEXT: 20004: 00 02 1f d6 br x16
> +// CHECK1: $d:
> +// CHECK1-NEXT: 20008: 0c 10 02 08 .word 0x0802100c
> +
> + .section .text.01, "ax", %progbits
> + .balign 4096
> + .globl _start
> + .type _start, %function
> +_start:
> + // Expect thunk on pass 2
> + bl need_thunk_after_patch
> + .section .text.02, "ax", %progbits
> + .space 4096 - 12
> +
> +// CHECK2: _start:
> +// CHECK2-NEXT: 21000: 00 fc ff 97 bl #-4096
> +
> + // Expect patch on pass 1
> + .section .text.03, "ax", %progbits
> + .globl t3_ff8_ldr
> + .type t3_ff8_ldr, %function
> +t3_ff8_ldr:
> + adrp x0, dat
> + ldr x1, [x1, #0]
> + ldr x0, [x0, :got_lo12:dat]
> + ret
> +
> +// CHECK3: t3_ff8_ldr:
> +// CHECK3-NEXT: 21ff8: 60 00 04 f0 adrp x0, #134279168
> +// CHECK3-NEXT: 21ffc: 21 00 40 f9 ldr x1, [x1]
> +// CHECK3-NEXT: 22000: 02 08 80 15 b #100671496
> +// CHECK3-NEXT: 22004: c0 03 5f d6 ret
> +
> + .section .text.04, "ax", %progbits
> + .space 64 * 1024 * 1024
> +
> + // Expect patch on pass 1
> + .section .text.05, "ax", %progbits
> + .balign 4096
> + .space 4096 - 8
> + .globl t3_ff8_str
> + .type t3_ff8_str, %function
> +t3_ff8_str:
> + adrp x0, dat
> + ldr x1, [x1, #0]
> + str x0, [x0, :got_lo12:dat]
> + ret
> +
> +// CHECK4: t3_ff8_str:
> +// CHECK4-NEXT: 4023ff8: 60 00 02 b0 adrp x0, #67162112
> +// CHECK4-NEXT: 4023ffc: 21 00 40 f9 ldr x1, [x1]
> +// CHECK4-NEXT: 4024000: 04 00 80 14 b #33554448
> +// CHECK4-NEXT: 4024004: c0 03 5f d6 ret
> +
> + .section .text.06, "ax", %progbits
> + .space 32 * 1024 * 1024
> +
> +// CHECK5: __CortexA53843419_21000:
> +// CHECK5-NEXT: 6024008: 00 00 40 f9 ldr x0, [x0]
> +// CHECK5-NEXT: 602400c: fe f7 7f 16 b #-100671496
> +// CHECK5: __CortexA53843419_4023000:
> +// CHECK5-NEXT: 6024010: 00 00 00 f9 str x0, [x0]
> +// CHECK5-NEXT: 6024014: fc ff 7f 17 b #-33554448
> +
> + .section .text.07, "ax", %progbits
> + .space (32 * 1024 * 1024) - 12300
> +
> + .section .text.08, "ax", %progbits
> + .globl need_thunk_after_patch
> + .type need_thunk_after_patch, %function
> +need_thunk_after_patch:
> + ret
> +
> +// CHECK6: need_thunk_after_patch:
> +// CHECK6-NEXT: 802100c: c0 03 5f d6 ret
> +
> + // Will need a patch on pass 2
> + .section .text.09, "ax", %progbits
> + .space 4096 - 20
> + .globl t3_ffc_ldr
> + .type t3_ffc_ldr, %function
> +t3_ffc_ldr:
> + adrp x0, dat
> + ldr x1, [x1, #0]
> + ldr x0, [x0, :got_lo12:dat]
> + ret
> +
> +// CHECK7: t3_ffc_ldr:
> +// CHECK7-NEXT: 8021ffc: 60 00 00 f0 adrp x0, #61440
> +// CHECK7-NEXT: 8022000: 21 00 40 f9 ldr x1, [x1]
> +// CHECK7-NEXT: 8022004: 02 00 00 14 b #8
> +// CHECK7-NEXT: 8022008: c0 03 5f d6 ret
> +// CHECK7: __CortexA53843419_8022004:
> +// CHECK7-NEXT: 802200c: 00 00 40 f9 ldr x0, [x0]
> +// CHECK7-NEXT: 8022010: fe ff ff 17 b #-8
> +
> + .section .data
> + .globl dat
> +dat: .quad 0
> Index: test/ELF/aarch64-cortex-a53-843419-address.s
> ===================================================================
> --- test/ELF/aarch64-cortex-a53-843419-address.s
> +++ test/ELF/aarch64-cortex-a53-843419-address.s
> @@ -4,7 +4,8 @@
> // RUN: .text : { *(.text) *(.text.*) *(.newisd) } \
> // RUN: .text2 : { *.(newos) } \
> // RUN: .data : { *(.data) } }" > %t.script
> -// RUN: ld.lld --script %t.script -fix-cortex-a53-843419 -verbose %t.o -o %t2 | FileCheck %s
> +// RUN: ld.lld --script %t.script -fix-cortex-a53-843419 -verbose %t.o -o %t2 | FileCheck -check-prefix=CHECK-PRINT %s
> +// RUN: llvm-objdump -triple=aarch64-linux-gnu -d %t2 | FileCheck %s
>
> // Test cases for Cortex-A53 Erratum 843419 that involve interactions
> // between the generated patches and the address of sections.
> @@ -34,8 +35,12 @@
> // symbols with the same type).
> // - We can ignore erratum sequences in multiple literal data ranges.
>
> -// CHECK: detected cortex-a53-843419 erratum sequence starting at FF8 in unpatched output.
> -
> +// CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at FF8 in unpatched output.
> +// CHECK: t3_ff8_ldr:
> +// CHECK-NEXT: ff8: 20 00 00 d0 adrp x0, #24576
> +// CHECK-NEXT: ffc: 21 00 40 f9 ldr x1, [x1]
> +// CHECK-NEXT: 1000: f9 0f 00 14 b #16356
> +// CHECK-NEXT: 1004: c0 03 5f d6 ret
> .section .text.01, "ax", %progbits
> .balign 4096
> .space 4096 - 8
> @@ -52,7 +57,12 @@
> // every symbol so we need to handle the case of $x $x.
> .local $x.999
> $x.999:
> -// CHECK-NEXT: detected cortex-a53-843419 erratum sequence starting at 1FFC in unpatched output.
> +// CHECK-PRINT-NEXT: detected cortex-a53-843419 erratum sequence starting at 1FFC in unpatched output.
> +// CHECK: t3_ffc_ldrsimd:
> +// CHECK-NEXT: 1ffc: 20 00 00 b0 adrp x0, #20480
> +// CHECK-NEXT: 2000: 21 00 40 bd ldr s1, [x1]
> +// CHECK-NEXT: 2004: fa 0b 00 14 b #12264
> +// CHECK-NEXT: 2008: c0 03 5f d6 ret
> .globl t3_ffc_ldrsimd
> .type t3_ffc_ldrsimd, %function
> .space 4096 - 12
> @@ -84,8 +94,12 @@
> .byte 0xf9
> // Check that we can recognise the erratum sequence post literal data.
>
> -// CHECK-NEXT: detected cortex-a53-843419 erratum sequence starting at 3FF8 in unpatched output.
> -
> +// CHECK-PRINT-NEXT: detected cortex-a53-843419 erratum sequence starting at 3FF8 in unpatched output.
> +// CHECK: t3_ffc_ldr:
> +// CHECK-NEXT: 3ff8: 00 00 00 f0 adrp x0, #12288
> +// CHECK-NEXT: 3ffc: 21 00 40 f9 ldr x1, [x1]
> +// CHECK-NEXT: 4000: fd 03 00 14 b #4084
> +// CHECK-NEXT: 4004: c0 03 5f d6 ret
> .space 4096 - 12
> .globl t3_ffc_ldr
> .type t3_ffc_ldr, %function
> @@ -95,14 +109,29 @@
> ldr x0, [x0, :got_lo12:dat]
> ret
>
> +// CHECK: __CortexA53843419_1000:
> +// CHECK-NEXT: 4fe4: 00 0c 40 f9 ldr x0, [x0, #24]
> +// CHECK-NEXT: 4fe8: 07 f0 ff 17 b #-16356
> +// CHECK: __CortexA53843419_2004:
> +// CHECK-NEXT: 4fec: 02 0c 40 f9 ldr x2, [x0, #24]
> +// CHECK-NEXT: 4ff0: 06 f4 ff 17 b #-12264
> +// CHECK: __CortexA53843419_4000:
> +// CHECK-NEXT: 4ff4: 00 0c 40 f9 ldr x0, [x0, #24]
> +// CHECK-NEXT: 4ff8: 03 fc ff 17 b #-4084
> +
> .section .text.02, "ax", %progbits
> - .space 4096 - 12
> + .space 4096 - 36
>
> // Start a new InputSectionDescription (see Linker Script) so the
> // start address will be affected by any patches added to previous
> // InputSectionDescription.
>
> -// CHECK: detected cortex-a53-843419 erratum sequence starting at 4FFC in unpatched output.
> +// CHECK-PRINT-NEXT: detected cortex-a53-843419 erratum sequence starting at 4FFC in unpatched output
> +// CHECK: t3_ffc_str:
> +// CHECK-NEXT: 4ffc: 00 00 00 d0 adrp x0, #8192
> +// CHECK-NEXT: 5000: 21 00 00 f9 str x1, [x1]
> +// CHECK-NEXT: 5004: fb 03 00 14 b #4076
> +// CHECK-NEXT: 5008: c0 03 5f d6 ret
>
> .section .newisd, "ax", %progbits
> .globl t3_ffc_str
> @@ -112,13 +141,23 @@
> str x1, [x1, #0]
> ldr x0, [x0, :got_lo12:dat]
> ret
> - .space 4096 - 20
> + .space 4096 - 28
>
> -// CHECK: detected cortex-a53-843419 erratum sequence starting at 5FF8 in unpatched output.
> +// CHECK: __CortexA53843419_5004:
> +// CHECK-NEXT: 5ff0: 00 0c 40 f9 ldr x0, [x0, #24]
> +// CHECK-NEXT: 5ff4: 05 fc ff 17 b #-4076
>
> // Start a new OutputSection (see Linker Script) so the
> // start address will be affected by any patches added to previous
> // InputSectionDescription.
> +
> +//CHECK-PRINT-NEXT: detected cortex-a53-843419 erratum sequence starting at 5FF8 in unpatched output
> +// CHECK: t3_ff8_str:
> +// CHECK-NEXT: 5ff8: 00 00 00 b0 adrp x0, #4096
> +// CHECK-NEXT: 5ffc: 21 00 00 f9 str x1, [x1]
> +// CHECK-NEXT: 6000: 03 00 00 14 b #12
> +// CHECK-NEXT: 6004: c0 03 5f d6 ret
> +
> .section .newos, "ax", %progbits
> .globl t3_ff8_str
> .type t3_ff8_str, %function
> @@ -132,6 +171,10 @@
> _start:
> ret
>
> +// CHECK: __CortexA53843419_6000:
> +// CHECK-NEXT: 600c: 00 0c 40 f9 ldr x0, [x0, #24]
> +// CHECK-NEXT: 6010: fd ff ff 17 b #-12
> +
> .data
> .globl dat
> dat: .word 0
> Index: ELF/Writer.cpp
> ===================================================================
> --- ELF/Writer.cpp
> +++ ELF/Writer.cpp
> @@ -1362,6 +1362,7 @@
> // alter InputSection addresses we must converge to a fixed point.
> if (Target->NeedsThunks || Config->AndroidPackDynRelocs) {
> ThunkCreator TC;
> + AArch64Err843419Patcher A64P;
> bool Changed;
> do {
> Script->assignAddresses();
> @@ -1371,7 +1372,7 @@
> if (Config->FixCortexA53Errata843419) {
> if (Changed)
> Script->assignAddresses();
> - reportA53Errata843419Fixes();
> + Changed |= A64P.createFixes();
> }
> if (InX::MipsGot)
> InX::MipsGot->updateAllocSize();
> Index: ELF/AArch64ErrataFix.h
> ===================================================================
> --- ELF/AArch64ErrataFix.h
> +++ ELF/AArch64ErrataFix.h
> @@ -12,11 +12,39 @@
>
> #include "lld/Common/LLVM.h"
>
> +#include <map>
> +#include <vector>
> +
> namespace lld {
> namespace elf {
>
> +class Defined;
> +class InputSection;
> +class InputSectionDescription;
> class OutputSection;
> -void reportA53Errata843419Fixes();
> +class Patch843419Section;
> +
> +class AArch64Err843419Patcher {
> +public:
> + // Returns true if Patches have been added to the OutputSections.
> + bool createFixes();
> +
> +private:
> + std::vector<Patch843419Section *>
> + patchInputSectionDescription(InputSectionDescription &ISD);
> +
> + void insertPatches(InputSectionDescription &ISD,
> + std::vector<Patch843419Section *> &Patches);
> +
> + void init();
> +
> + // A cache of the mapping symbols defined by the InputSection sorted in order
> + // of ascending value with redundant symbols removed. These describe
> + // the ranges of code and data in an executable InputSection.
> + std::map<InputSection *, std::vector<const Defined *>> SectionMap;
> +
> + bool Initialized = false;
> +};
>
> } // namespace elf
> } // namespace lld
> Index: ELF/AArch64ErrataFix.cpp
> ===================================================================
> --- ELF/AArch64ErrataFix.cpp
> +++ ELF/AArch64ErrataFix.cpp
> @@ -23,9 +23,6 @@
> // - We can place the replacement sequence within range of the branch.
>
> // FIXME:
> -// - At this stage the implementation only supports detection and not fixing,
> -// this is sufficient to test the decode and recognition of the erratum
> -// sequence.
> // - The implementation here only supports one patch, the AArch64 Cortex-A53
> // errata 843419 that affects r0p0, r0p1, r0p2 and r0p4 versions of the core.
> // To keep the initial version simple there is no support for multiple
> @@ -336,13 +333,6 @@
> isLoadStoreRegisterUnsigned(Instr4) && getRn(Instr4) == Rn;
> }
>
> -static void report843419Fix(uint64_t AdrpAddr) {
> - if (!Config->Verbose)
> - return;
> - message("detected cortex-a53-843419 erratum sequence starting at " +
> - utohexstr(AdrpAddr) + " in unpatched output.");
> -}
> -
> // Scan the instruction sequence starting at Offset Off from the base of
> // InputSection IS. We update Off in this function rather than in the caller as
> // we can skip ahead much further into the section when we know how many
> @@ -385,16 +375,66 @@
> return PatchOff;
> }
>
> -// The AArch64 ABI permits data in executable sections. We must avoid scanning
> -// this data as if it were instructions to avoid false matches.
> -// The ABI Section 4.5.4 Mapping symbols; defines local symbols that describe
> -// half open intervals [Symbol Value, Next Symbol Value) of code and data
> -// within sections. If there is no next symbol then the half open interval is
> -// [Symbol Value, End of section). The type, code or data, is determined by the
> -// mapping symbol name, $x for code, $d for data.
> -std::map<InputSection *,
> - std::vector<const Defined *>> static makeAArch64SectionMap() {
> - std::map<InputSection *, std::vector<const Defined *>> SectionMap;
> +class lld::elf::Patch843419Section : public SyntheticSection {
> +public:
> + Patch843419Section(InputSection *P, uint64_t Off);
> +
> + void writeTo(uint8_t *Buf) override;
> +
> + size_t getSize() const override { return 8; }
> +
> + uint64_t getLDSTAddr() const;
> +
> + // The Section we are patching.
> + const InputSection *Patchee;
> + // The offset of the instruction in the Patchee section we are patching.
> + uint64_t PatcheeOffset;
> + // A label for the start of the Patch that we can use as a relocation target.
> + Symbol *PatchSym;
> +};
> +
> +lld::elf::Patch843419Section::Patch843419Section(InputSection *P, uint64_t Off)
> + : SyntheticSection(SHF_ALLOC | SHF_EXECINSTR, SHT_PROGBITS, 4,
> + ".text.patch"),
> + Patchee(P), PatcheeOffset(Off) {
> + this->Parent = P->getParent();
> + PatchSym = addSyntheticLocal(
> + Saver.save("__CortexA53843419_" + utohexstr(getLDSTAddr())), STT_FUNC, 0,
> + getSize(), this);
> + addSyntheticLocal(Saver.save("$x"), STT_NOTYPE, 0, 0, this);
> +}
> +
> +uint64_t lld::elf::Patch843419Section::getLDSTAddr() const {
> + return Patchee->getParent()->Addr + Patchee->OutSecOff + PatcheeOffset;
> +}
> +
> +void lld::elf::Patch843419Section::writeTo(uint8_t *Buf) {
> + // Copy the instruction that we will be replacing with a branch in the
> + // Patchee Section.
> + write32le(Buf, read32le(Patchee->Data.begin() + PatcheeOffset));
> +
> + // Apply any relocation transferred from the original PatcheeSection.
> + // For a SyntheticSection Buf already has OutSecOff added, but relocateAlloc
> + // also adds OutSecOff so we need to subtract to avoid double counting.
> + this->relocateAlloc(Buf - OutSecOff, Buf - OutSecOff + getSize());
> +
> + // Return address is the next instruction after the one we have just copied.
> + uint64_t S = getLDSTAddr() + 4;
> + uint64_t P = PatchSym->getVA() + 4;
> + Target->relocateOne(Buf + 4, R_AARCH64_JUMP26, S - P);
> +}
> +
> +void AArch64Err843419Patcher::init() {
> + // The AArch64 ABI permits data in executable sections. We must avoid scanning
> + // this data as if it were instructions to avoid false matches. We use the
> + // mapping symbols in the InputObjects to identify this data, caching the
> + // results in SectionMap so we don't have to recalculate it each pass.
> +
> + // The ABI Section 4.5.4 Mapping symbols; defines local symbols that describe
> + // half open intervals [Symbol Value, Next Symbol Value) of code and data
> + // within sections. If there is no next symbol then the half open interval is
> + // [Symbol Value, End of section). The type, code or data, is determined by
> + // the mapping symbol name, $x for code, $d for data.
> auto IsCodeMapSymbol = [](const Symbol *B) {
> return B->getName() == "$x" || B->getName().startswith("$x.");
> };
> @@ -435,56 +475,174 @@
> }),
> MapSyms.end());
> }
> - return SectionMap;
> -}
> -
> -static void scanInputSectionDescription(std::vector<const Defined *> &MapSyms,
> - InputSection *IS) {
> - // Use SectionMap to make sure we only scan code and not inline data.
> - // We have already sorted MapSyms in ascending order and removed
> - // consecutive mapping symbols of the same type. Our range of
> - // executable instructions to scan is therefore [CodeSym->Value,
> - // DataSym->Value) or [CodeSym->Value, section size).
> - auto CodeSym = llvm::find_if(MapSyms, [&](const Defined *MS) {
> - return MS->getName().startswith("$x");
> - });
> -
> - while (CodeSym != MapSyms.end()) {
> - auto DataSym = std::next(CodeSym);
> - uint64_t Off = (*CodeSym)->Value;
> - uint64_t Limit =
> - (DataSym == MapSyms.end()) ? IS->Data.size() : (*DataSym)->Value;
> -
> - while (Off < Limit) {
> - uint64_t StartAddr = IS->getParent()->Addr + IS->OutSecOff + Off;
> - if (scanCortexA53Errata843419(IS, Off, Limit))
> - report843419Fix(StartAddr);
> + Initialized = true;
> +}
> +
> +// Insert the PatchSections we have created back into the
> +// InputSectionDescription. As inserting patches alters the addresses of
> +// InputSections that follow them, we try and place the patches after all the
> +// executable sections, although we may need to insert them earlier if the
> +// InputSectionDescription is larger than the maximum branch range.
> +void AArch64Err843419Patcher::insertPatches(
> + InputSectionDescription &ISD, std::vector<Patch843419Section *> &Patches) {
> + uint64_t ISLimit;
> + uint64_t PrevISLimit = ISD.Sections.front()->OutSecOff;
> + uint64_t PatchUpperBound = PrevISLimit + Target->ThunkSectionSpacing;
> +
> + // Set the OutSecOff of patches to the place where we want to insert them.
> + // We use a similar strategy to Thunk placement. Place patches roughly
> + // every multiple of maximum branch range.
> + auto PatchIt = Patches.begin();
> + auto PatchEnd = Patches.end();
> + for (const InputSection *IS : ISD.Sections) {
> + ISLimit = IS->OutSecOff + IS->getSize();
> + if (ISLimit > PatchUpperBound) {
> + while (PatchIt != PatchEnd) {
> + if ((*PatchIt)->getLDSTAddr() >= PrevISLimit)
> + break;
> + (*PatchIt)->OutSecOff = PrevISLimit;
> + ++PatchIt;
> + }
> + PatchUpperBound = PrevISLimit + Target->ThunkSectionSpacing;
> }
> - if (DataSym == MapSyms.end())
> - break;
> - CodeSym = std::next(DataSym);
> + PrevISLimit = ISLimit;
> + }
> + for (; PatchIt != PatchEnd; ++PatchIt) {
> + (*PatchIt)->OutSecOff = ISLimit;
> }
> -}
>
> -// Scan all the executable code in an AArch64 link to detect the Cortex-A53
> -// erratum 843419.
> -// FIXME: The current implementation only scans for the erratum sequence, it
> -// does not attempt to fix it.
> -void lld::elf::reportA53Errata843419Fixes() {
> - std::map<InputSection *, std::vector<const Defined *>> SectionMap =
> - makeAArch64SectionMap();
> + // Merge all patch sections. We use the OutSecOff assigned above to
> + // determine the insertion point. This is ok as we only merge into an
> + // InputSectionDescription once per pass, and at the end of the pass
> + // assignAddresses() will recalculate all the OutSecOff values.
> + std::vector<InputSection *> Tmp;
> + Tmp.reserve(ISD.Sections.size() + Patches.size());
> + auto MergeCmp = [](const InputSection *A, const InputSection *B) {
> + if (A->OutSecOff < B->OutSecOff)
> + return true;
> + if (A->OutSecOff == B->OutSecOff && isa<Patch843419Section>(A) &&
> + !isa<Patch843419Section>(B))
> + return true;
> + return false;
> + };
> + std::merge(ISD.Sections.begin(), ISD.Sections.end(), Patches.begin(),
> + Patches.end(), std::back_inserter(Tmp), MergeCmp);
> + ISD.Sections = std::move(Tmp);
> +}
> +
> +// Given an erratum sequence that starts at address AdrpAddr, with an
> +// instruction that we need to patch at PatcheeOffset from the start of
> +// InputSection IS, create a Patch843419 Section and add it to the
> +// Patches that we need to insert.
> +static void implementPatch(uint64_t AdrpAddr, uint64_t PatcheeOffset,
> + InputSection *IS,
> + std::vector<Patch843419Section *> &Patches) {
> + // There may be a relocation at the same offset that we are patching. There
> + // are three cases that we need to consider.
> + // Case 1: R_AARCH64_JUMP26 branch relocation. We have already patched this
> + // instance of the erratum on a previous pass and altered the relocation. We
> + // have nothing more to do.
> + // Case 2: A load/store register (unsigned immediate) class relocation. There
> + // are two of these, R_AARCH64_LDST64_ABS_LO12_NC and
> + // R_AARCH64_LD64_GOT_LO12_NC, and they are both absolute. We need to add the
> + // same relocation to the patch, and replace the relocation with a
> + // R_AARCH64_JUMP26 branch relocation.
> + // Case 3: No relocation. We must create a new R_AARCH64_JUMP26 branch
> + // relocation at the offset.
> + auto RelIt = std::find_if(
> + IS->Relocations.begin(), IS->Relocations.end(),
> + [=](const Relocation &R) { return R.Offset == PatcheeOffset; });
> + if (RelIt != IS->Relocations.end() && RelIt->Type == R_AARCH64_JUMP26)
> + return;
> +
> + if (Config->Verbose)
> + message("detected cortex-a53-843419 erratum sequence starting at " +
> + utohexstr(AdrpAddr) + " in unpatched output.");
> +
> + auto *PS = make<Patch843419Section>(IS, PatcheeOffset);
> + Patches.push_back(PS);
> +
> + auto MakeRelToPatch = [](uint64_t Offset, Symbol *PatchSym) {
> + return Relocation{R_PC, R_AARCH64_JUMP26, Offset, 0, PatchSym};
> + };
>
> + if (RelIt != IS->Relocations.end()) {
> + PS->Relocations.push_back(
> + {RelIt->Expr, RelIt->Type, 0, RelIt->Addend, RelIt->Sym});
> + *RelIt = MakeRelToPatch(PatcheeOffset, PS->PatchSym);
> + } else
> + IS->Relocations.push_back(MakeRelToPatch(PatcheeOffset, PS->PatchSym));
> +}
> +
> +// Scan all the instructions in InputSectionDescription; for each instance of
> +// the erratum sequence create a Patch843419Section. We return the list of
> +// Patch843419Sections that need to be applied to ISD.
> +std::vector<Patch843419Section *>
> +AArch64Err843419Patcher::patchInputSectionDescription(
> + InputSectionDescription &ISD) {
> + std::vector<Patch843419Section *> Patches;
> + for (InputSection *IS : ISD.Sections) {
> + // LLD doesn't use the erratum sequence in SyntheticSections.
> + if (isa<SyntheticSection>(IS))
> + continue;
> + // Use SectionMap to make sure we only scan code and not inline data.
> + // We have already sorted MapSyms in ascending order and removed consecutive
> + // mapping symbols of the same type. Our range of executable instructions to
> + // scan is therefore [CodeSym->Value, DataSym->Value) or [CodeSym->Value,
> + // section size).
> + std::vector<const Defined *> &MapSyms = SectionMap[IS];
> +
> + auto CodeSym = llvm::find_if(MapSyms, [&](const Defined *MS) {
> + return MS->getName().startswith("$x");
> + });
> +
> + while (CodeSym != MapSyms.end()) {
> + auto DataSym = std::next(CodeSym);
> + uint64_t Off = (*CodeSym)->Value;
> + uint64_t Limit =
> + (DataSym == MapSyms.end()) ? IS->Data.size() : (*DataSym)->Value;
> +
> + while (Off < Limit) {
> + uint64_t StartAddr = IS->getParent()->Addr + IS->OutSecOff + Off;
> + if (uint64_t PatcheeOffset = scanCortexA53Errata843419(IS, Off, Limit))
> + implementPatch(StartAddr, PatcheeOffset, IS, Patches);
> + }
> + if (DataSym == MapSyms.end())
> + break;
> + CodeSym = std::next(DataSym);
> + }
> + }
> + return Patches;
> +}
> +
> +// For each InputSectionDescription make one pass over the executable sections
> +// looking for the erratum sequence; creating a synthetic Patch843419Section
> +// for each instance found. We insert these synthetic patch sections after the
> +// executable code in each InputSectionDescription.
> +//
> +// PreConditions:
> +// The Output and Input Sections have had their final addresses assigned.
> +//
> +// PostConditions:
> +// Returns true if at least one patch was added. The addresses of the
> +// Output and Input Sections may have been changed.
> +// Returns false if no patches were required and no changes were made.
> +bool AArch64Err843419Patcher::createFixes() {
> + if (Initialized == false)
> + init();
> +
> + bool AddressesChanged = false;
> for (OutputSection *OS : OutputSections) {
> if (!(OS->Flags & SHF_ALLOC) || !(OS->Flags & SHF_EXECINSTR))
> continue;
> for (BaseCommand *BC : OS->SectionCommands)
> if (auto *ISD = dyn_cast<InputSectionDescription>(BC)) {
> - for (InputSection *IS : ISD->Sections) {
> - // LLD doesn't use the erratum sequence in SyntheticSections.
> - if (isa<SyntheticSection>(IS))
> - continue;
> - scanInputSectionDescription(SectionMap[IS], IS);
> + std::vector<Patch843419Section *> Patches =
> + patchInputSectionDescription(*ISD);
> + if (!Patches.empty()) {
> + insertPatches(*ISD, Patches);
> + AddressesChanged = true;
> }
> }
> }
> + return AddressesChanged;
> }
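
For anyone cross-checking the test expectations against writeTo(): the
following is only a minimal sketch, assuming the standard A64 unconditional
branch encoding (opcode bits 0x14000000 plus a signed 26-bit offset counted in
4-byte words). The helpers encodeB and returnBranch are hypothetical names, not
LLD functions; in the patch this work is done by
Target->relocateOne(Buf + 4, R_AARCH64_JUMP26, S - P). The static_asserts
compare the result with the instruction words in the CHECK lines quoted above.

```cpp
#include <cstdint>

// Hypothetical helper, not part of LLD: encode an A64 `b` instruction whose
// target lies ByteOffset bytes from the branch instruction itself.
// Assumes ByteOffset is a multiple of 4 and within the +/-128MiB range.
constexpr uint32_t encodeB(int64_t ByteOffset) {
  // imm26 holds the offset in 4-byte words, truncated to 26 bits.
  return 0x14000000u |
         (static_cast<uint32_t>(ByteOffset / 4) & 0x03ffffffu);
}

// Mirrors the arithmetic in Patch843419Section::writeTo():
//   S = address of the instruction after the patched load/store,
//   P = address of the branch, which is the second word of the patch.
constexpr uint32_t returnBranch(uint64_t LDSTAddr, uint64_t PatchAddr) {
  return encodeB(static_cast<int64_t>((LDSTAddr + 4) - (PatchAddr + 4)));
}

// "5004: fb 03 00 14  b #4076" (branch from the patchee into the patch).
static_assert(encodeB(4076) == 0x140003fbu, "forward branch to patch");
// "5ff4: 05 fc ff 17  b #-4076" (patch at 0x5ff0 returning to 0x5008).
static_assert(returnBranch(0x5004, 0x5ff0) == 0x17fffc05u, "return branch");
// "6010: fd ff ff 17  b #-12" (patch at 0x600c returning to 0x6004).
static_assert(returnBranch(0x6000, 0x600c) == 0x17fffffdu, "return branch");
```

The bytes in the CHECK lines are little-endian, so 0x17fffc05 appears as
"05 fc ff 17" in the disassembly.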