[llvm-branch-commits] [lld] 852b37f - [LLD][ELF][ARM][AArch64] Only round up ThunkSection Size when large OS.
Hans Wennborg via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Tue Feb 4 02:10:28 PST 2020
Author: Peter Smith
Date: 2020-02-04T11:08:31+01:00
New Revision: 852b37f83b2dd31ff4d708c2a789857418171f93
URL: https://github.com/llvm/llvm-project/commit/852b37f83b2dd31ff4d708c2a789857418171f93
DIFF: https://github.com/llvm/llvm-project/commit/852b37f83b2dd31ff4d708c2a789857418171f93.diff
LOG: [LLD][ELF][ARM][AArch64] Only round up ThunkSection Size when large OS.
In D71281 a fix was put in to round up the size of a ThunkSection to the
nearest 4KiB when performing errata patching. This fixed a problem with a
very large instrumented program that had thunks and patches mutually
trigger each other. Unfortunately it triggers an assertion failure in an
AArch64 allyesconfig build of the kernel. There is a specific assertion
preventing an InputSectionDescription being larger than 4KiB. This will
always trigger if there is at least one Thunk needed in that
InputSectionDescription, which is possible for an allyesconfig build.
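A minimal sketch of the arithmetic behind the D71281 round-up, for illustration only. The alignTo helper is a local stand-in modelled on llvm::alignTo, pageOffsetAfterThunk is a hypothetical helper, and the addresses and sizes are made-up values rather than ones taken from the kernel build.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for llvm::alignTo: round value up to a multiple of align.
// Assumes align is nonzero.
constexpr uint64_t alignTo(uint64_t value, uint64_t align) {
  return (value + align - 1) / align * align;
}

// Hypothetical helper: offset modulo 4 KiB of code that originally sat at
// addr, once a ThunkSection of thunkSize bytes has been placed before it.
inline uint64_t pageOffsetAfterThunk(uint64_t addr, uint64_t thunkSize,
                                     bool roundUp) {
  assert(thunkSize != 0 && "sketch assumes a non-empty ThunkSection");
  uint64_t reported = roundUp ? alignTo(thunkSize, 4096) : thunkSize;
  return (addr + reported) % 4096;
}

// Without the round-up, a 16-byte thunk changes the offset modulo 4 KiB of
// the code that follows it, which can invalidate existing errata patches.
static_assert((0x11ff8 + 16) % 4096 != 0x11ff8 % 4096,
              "unrounded thunk disturbs the page offset");

// With the round-up, the shift is a whole multiple of 4 KiB, so the offset
// modulo 4 KiB is unchanged.
static_assert((0x11ff8 + alignTo(16, 4096)) % 4096 == 0x11ff8 % 4096,
              "rounded thunk preserves the page offset");

// The cost: even a single 16-byte thunk now occupies a full 4 KiB, so any
// region containing it cannot satisfy a "smaller than 4 KiB" assertion.
static_assert(alignTo(16, 4096) == 4096, "one thunk reports 4 KiB");
```

The last check is the crux of the regression: once the reported size is at least 4 KiB, an InputSectionDescription that contains the ThunkSection can no longer be smaller than 4 KiB.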
Abstractly the problem case is:
.text : {
*(.text) ;
...
. = ALIGN(SZ_4K);
__idmap_text_start = .;
*(.idmap.text)
__idmap_text_end = .;
...
}
The assertion checks that __idmap_text_end - __idmap_text_start is < 4 KiB.
Note that there is more than one InputSectionDescription in the
OutputSection so we can't just restrict the fix to OutputSections smaller
than 4 KiB.
The fix presented here limits the D71281 round-up to InputSectionDescriptions
that meet the following conditions:
1.) The OutputSection is bigger than the thunkSectionSpacing so adding
thunks will affect the addresses of following code.
2.) The InputSectionDescription is larger than 4 KiB. This will prevent
any assertion failures that an InputSectionDescription is < 4 KiB
in size.
We do this at ThunkSection creation time as at this point we know that
the addresses are stable and up to date prior to adding the thunks as
assignAddresses() will have been called immediately prior to thunk
generation.
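The two conditions above can be sketched as a standalone predicate. This is a simplified model, not LLD's code: the Section struct, the shouldRoundUp name, and its parameters are hypothetical stand-ins for the InputSectionDescription contents, the OutputSection size, and the target's thunk section spacing.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an input section that already has an offset
// assigned within its OutputSection.
struct Section {
  uint64_t outSecOff; // offset from the start of the OutputSection
  uint64_t size;      // size in bytes
};

// Span covered by one InputSectionDescription: from the start of its first
// section to the end of its last. Requires a non-empty list.
inline uint64_t isdSpan(const std::vector<Section> &sections) {
  assert(!sections.empty() && "span is undefined for an empty description");
  return sections.back().outSecOff + sections.back().size -
         sections.front().outSecOff;
}

// Decide whether a new ThunkSection should report its size rounded up to
// 4 KiB. All parameters are assumptions of this sketch.
inline bool shouldRoundUp(bool errataFixEnabled, uint64_t outputSectionSize,
                          uint64_t thunkSectionSpacing,
                          const std::vector<Section> &sections) {
  if (!errataFixEnabled || sections.empty())
    return false;
  // Condition 1: the OutputSection is larger than the thunk spacing.
  // Condition 2: the InputSectionDescription spans more than 4 KiB.
  return outputSectionSize > thunkSectionSpacing && isdSpan(sections) > 4096;
}
```

Under this model, a description shaped like the .idmap.text case, which spans less than 4 KiB, never gets the round-up regardless of how large the enclosing OutputSection is.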
The fix reverts the two tests affected by D71281 to their original state
as they no longer need the 4KiB size roundup. I've added simpler tests to
check for D71281 when the OutputSection size is larger than the ThunkSection
spacing.
Fixes https://github.com/ClangBuiltLinux/linux/issues/812
Differential Revision: https://reviews.llvm.org/D72344
(cherry picked from commit 01ad4c838466bd5db180608050ed8ccb3b62d136)
Added:
lld/test/ELF/aarch64-cortex-a53-843419-thunk-align.s
lld/test/ELF/arm-fix-cortex-a8-thunk-align.s
Modified:
lld/ELF/Relocations.cpp
lld/ELF/SyntheticSections.cpp
lld/ELF/SyntheticSections.h
lld/test/ELF/aarch64-cortex-a53-843419-thunk.s
lld/test/ELF/arm-fix-cortex-a8-thunk.s
Removed:
################################################################################
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index 18032b9c2cf8..93ec06610716 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -1758,6 +1758,37 @@ ThunkSection *ThunkCreator::addThunkSection(OutputSection *os,
uint64_t off) {
auto *ts = make<ThunkSection>(os, off);
ts->partition = os->partition;
+ if ((config->fixCortexA53Errata843419 || config->fixCortexA8) &&
+ !isd->sections.empty()) {
+ // The errata fixes are sensitive to addresses modulo 4 KiB. When we add
+ // thunks we disturb the base addresses of sections placed after the thunks
+ // this makes patches we have generated redundant, and may cause us to
+ // generate more patches as different instructions are now in sensitive
+ // locations. When we generate more patches we may force more branches to
+ // go out of range, causing more thunks to be generated. In pathological
+ // cases this can cause the address dependent content pass not to converge.
+ // We fix this by rounding up the size of the ThunkSection to 4KiB, this
+ // limits the insertion of a ThunkSection on the addresses modulo 4 KiB,
+ // which means that adding Thunks to the section does not invalidate
+ // errata patches for following code.
+ // Rounding up the size to 4KiB has consequences for code-size and can
+ // trip up linker script defined assertions. For example the linux kernel
+ // has an assertion that what LLD represents as an InputSectionDescription
+ // does not exceed 4 KiB even if the overall OutputSection is > 128 Mib.
+ // We use the heuristic of rounding up the size when both of the following
+ // conditions are true:
+ // 1.) The OutputSection is larger than the ThunkSectionSpacing. This
+ // accounts for the case where no single InputSectionDescription is
+ // larger than the OutputSection size. This is conservative but simple.
+ // 2.) The InputSectionDescription is larger than 4 KiB. This will prevent
+ // any assertion failures that an InputSectionDescription is < 4 KiB
+ // in size.
+ uint64_t isdSize = isd->sections.back()->outSecOff +
+ isd->sections.back()->getSize() -
+ isd->sections.front()->outSecOff;
+ if (os->size > target->getThunkSectionSpacing() && isdSize > 4096)
+ ts->roundUpSizeForErrata = true;
+ }
isd->thunkSections.push_back({ts, pass});
return ts;
}
diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index cb1706dfdde8..ea6eab4b47ad 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -3460,13 +3460,8 @@ ThunkSection::ThunkSection(OutputSection *os, uint64_t off)
this->outSecOff = off;
}
-// When the errata patching is on, we round the size up to a 4 KiB
-// boundary. This limits the effect that adding Thunks has on the addresses
-// of the program modulo 4 KiB. As the errata patching is sensitive to address
-// modulo 4 KiB this can prevent further patches from being needed due to
-// Thunk insertion.
size_t ThunkSection::getSize() const {
- if (config->fixCortexA53Errata843419 || config->fixCortexA8)
+ if (roundUpSizeForErrata)
return alignTo(size, 4096);
return size;
}
diff --git a/lld/ELF/SyntheticSections.h b/lld/ELF/SyntheticSections.h
index 65f9aabdc13f..5f59178fb541 100644
--- a/lld/ELF/SyntheticSections.h
+++ b/lld/ELF/SyntheticSections.h
@@ -1069,6 +1069,10 @@ class ThunkSection : public SyntheticSection {
InputSection *getTargetInputSection() const;
bool assignOffsets();
+ // When true, round up reported size of section to 4 KiB. See comment
+ // in addThunkSection() for more details.
+ bool roundUpSizeForErrata = false;
+
private:
std::vector<Thunk *> thunks;
size_t size = 0;
diff --git a/lld/test/ELF/aarch64-cortex-a53-843419-thunk-align.s b/lld/test/ELF/aarch64-cortex-a53-843419-thunk-align.s
new file mode 100644
index 000000000000..a410233dcdfb
--- /dev/null
+++ b/lld/test/ELF/aarch64-cortex-a53-843419-thunk-align.s
@@ -0,0 +1,74 @@
+// REQUIRES: aarch64
+// RUN: llvm-mc -filetype=obj -triple=aarch64-none-linux %s -o %t.o
+// RUN: echo "SECTIONS { \
+// RUN: .text 0x10000 : { \
+// RUN: *(.text.01) ; \
+// RUN: . += 0x8000000 ; \
+// RUN: *(.text.02) } \
+// RUN: .foo : { *(.foo_sec) } } " > %t.script
+// RUN: ld.lld -pie --fix-cortex-a53-843419 --script=%t.script %t.o -o %t2
+// RUN: llvm-objdump --no-show-raw-insn -triple=aarch64-linux-gnu -d %t2
+
+
+/// %t2 is > 128 Megabytes, so delete it early.
+// RUN: rm %t2
+
+/// Test case that for an OutputSection larger than the ThunkSectionSpacing
+/// --fix-cortex-a53-843419 will cause the size of the ThunkSection to be
+/// rounded up to the nearest 4KiB
+
+ .section .text.01, "ax", %progbits
+ .balign 4096
+ .globl _start
+ .type _start, %function
+_start:
+/// Range extension thunk needed, due to linker script
+ bl far_away
+ .space 4096 - 12
+
+/// Erratum sequence
+ .globl t3_ff8_ldr
+ .type t3_ff8_ldr, %function
+t3_ff8_ldr:
+ adrp x0, dat
+ ldr x1, [x1, #0]
+ ldr x0, [x0, :lo12:dat]
+ ret
+
+/// Expect thunk and patch to be inserted here
+// CHECK: 0000000000011008 __AArch64ADRPThunk_far_away:
+// CHECK-NEXT: 11008: adrp x16, #134221824
+// CHECK-NEXT: add x16, x16, #16
+// CHECK-NEXT: br x16
+// CHECK: 0000000000012008 __CortexA53843419_11000:
+// CHECK-NEXT: 12008: ldr x0, [x0, #168]
+// CHECK-NEXT: b #-4104 <t3_ff8_ldr+0xc>
+
+ .section .text.02, "ax", %progbits
+ .globl far_away
+ .type far_away, function
+far_away:
+ bl _start
+ ret
+/// Expect thunk for _start not to have size rounded up to 4KiB as it is at
+/// the end of the OutputSection
+// CHECK: 0000000008012010 far_away:
+// CHECK-NEXT: 8012010: bl #8
+// CHECK-NEXT: ret
+// CHECK: 0000000008012018 __AArch64ADRPThunk__start:
+// CHECK-NEXT: 8012018: adrp x16, #-134225920
+// CHECK-NEXT: add x16, x16, #0
+// CHECK-NEXT: br x16
+// CHECK: 0000000008012024 foo:
+// CHECK-NEXT: 8012024: ret
+ .section .foo_sec, "ax", %progbits
+ .globl foo
+ .type foo, function
+foo:
+ ret
+
+
+ .section .data
+ .balign 8
+ .globl dat
+dat: .quad 0
diff --git a/lld/test/ELF/aarch64-cortex-a53-843419-thunk.s b/lld/test/ELF/aarch64-cortex-a53-843419-thunk.s
index 7330296ac08e..2242757d09b4 100644
--- a/lld/test/ELF/aarch64-cortex-a53-843419-thunk.s
+++ b/lld/test/ELF/aarch64-cortex-a53-843419-thunk.s
@@ -5,7 +5,6 @@
// RUN: .text2 0x8010000 : { *(.text.04) } } " > %t.script
// RUN: ld.lld --script %t.script -fix-cortex-a53-843419 -verbose %t.o -o %t2 \
// RUN: 2>&1 | FileCheck -check-prefix=CHECK-PRINT %s
-
// RUN: llvm-objdump --no-show-raw-insn -triple=aarch64-linux-gnu -d %t2 | FileCheck %s
/// %t2 is 128 Megabytes, so delete it early.
@@ -23,11 +22,9 @@
_start:
bl far_away
/// Thunk to far_away, size 16-bytes goes here.
- /// Thunk Section with patch enabled has its size rounded up to 4KiB
- /// this leaves the address of following sections the same modulo 4 KiB
.section .text.02, "ax", %progbits
- .space 4096 - 12
+ .space 4096 - 28
/// Erratum sequence will only line up at address 0 modulo 0xffc when
/// Thunk is inserted.
@@ -40,13 +37,13 @@ t3_ff8_ldr:
ldr x0, [x0, :got_lo12:dat]
ret
-// CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 11FF8 in unpatched output.
-// CHECK: 0000000000011ff8 t3_ff8_ldr:
-// CHECK-NEXT: adrp x0, #134213632
+// CHECK-PRINT: detected cortex-a53-843419 erratum sequence starting at 10FF8 in unpatched output.
+// CHECK: 0000000000010ff8 t3_ff8_ldr:
+// CHECK-NEXT: adrp x0, #134217728
// CHECK-NEXT: ldr x1, [x1]
// CHECK-NEXT: b #8
// CHECK-NEXT: ret
-// CHECK: 0000000000012008 __CortexA53843419_12000:
+// CHECK: 0000000000011008 __CortexA53843419_11000:
// CHECK-NEXT: ldr x0, [x0, #8]
// CHECK-NEXT: b #-8
.section .text.04, "ax", %progbits
diff --git a/lld/test/ELF/arm-fix-cortex-a8-thunk-align.s b/lld/test/ELF/arm-fix-cortex-a8-thunk-align.s
new file mode 100644
index 000000000000..49b95d503c57
--- /dev/null
+++ b/lld/test/ELF/arm-fix-cortex-a8-thunk-align.s
@@ -0,0 +1,41 @@
+// REQUIRES: arm
+// RUN: llvm-mc -filetype=obj -triple=armv7a-linux-gnueabihf --arm-add-build-attributes %s -o %t.o
+// RUN: ld.lld --fix-cortex-a8 --shared %t.o -o %t2
+// RUN: llvm-objdump -d --no-show-raw-insn %t2 | FileCheck %s
+
+/// Test case that for an OutputSection larger than the ThunkSectionSpacing
+/// --fix-cortex-a8 will cause the size of the ThunkSection to be rounded up to
+/// the nearest 4KiB
+ .thumb
+
+ .section .text.01, "ax", %progbits
+ .balign 4096
+ .globl _start
+ .type _start, %function
+_start:
+ /// state change thunk required
+ b.w arm_func
+thumb_target:
+ .space 4096 - 10
+ /// erratum patch needed
+ nop.w
+ b.w thumb_target
+
+/// Expect thunk and patch to be inserted here
+// CHECK: 00003004 __ThumbV7PILongThunk_arm_func:
+// CHECK-NEXT: 3004: movw r12, #4088
+// CHECK-NEXT: movt r12, #256
+// CHECK-NEXT: add r12, pc
+// CHECK-NEXT: bx r12
+// CHECK: 00004004 __CortexA8657417_2FFE:
+// CHECK-NEXT: 4004: b.w #-8196
+ .section .text.02
+ /// Take us over thunk section spacing
+ .space 16 * 1024 * 1024
+
+ .section .text.03, "ax", %progbits
+ .arm
+ .balign 4
+ .type arm_func, %function
+arm_func:
+ bx lr
diff --git a/lld/test/ELF/arm-fix-cortex-a8-thunk.s b/lld/test/ELF/arm-fix-cortex-a8-thunk.s
index c1efb5bd2481..544e82cb0489 100644
--- a/lld/test/ELF/arm-fix-cortex-a8-thunk.s
+++ b/lld/test/ELF/arm-fix-cortex-a8-thunk.s
@@ -1,7 +1,7 @@
// REQUIRES: arm
// RUN: llvm-mc -filetype=obj -triple=armv7a-linux-gnueabihf --arm-add-build-attributes %s -o %t.o
// RUN: echo "SECTIONS { \
-// RUN: .text0 0x01200a : { *(.text.00) } \
+// RUN: .text0 0x011006 : { *(.text.00) } \
// RUN: .text1 0x110000 : { *(.text.01) *(.text.02) *(.text.03) \
// RUN: *(.text.04) } \
// RUN: .text2 0x210000 : { *(.text.05) } } " > %t.script
@@ -32,7 +32,7 @@ _start:
// CHECK-NEXT: bx r12
.section .text.02, "ax", %progbits
- .space 4096 - 10
+ .space 4096 - 22
.section .text.03, "ax", %progbits
.thumb_func
@@ -43,21 +43,21 @@ target:
bl target
/// Expect erratum patch inserted here
-// CHECK: 00111ffa target:
-// CHECK-NEXT: 111ffa: nop.w
+// CHECK: 00110ffa target:
+// CHECK-NEXT: 110ffa: nop.w
// CHECK-NEXT: bl #2
-// CHECK: 00112004 __CortexA8657417_111FFE:
-// CHECK-NEXT: 112004: b.w #-14
+// CHECK: 00111004 __CortexA8657417_110FFE:
+// CHECK-NEXT: 111004: b.w #-14
/// Expect range extension thunk here.
-// CHECK: 00112008 __ThumbV7PILongThunk_early:
-// CHECK-NEXT: 112008: b.w #-1048578
+// CHECK: 00111008 __ThumbV7PILongThunk_early:
+// CHECK-NEXT: 111008: b.w #-1048582
.section .text.04, "ax", %progbits
/// The erratum patch will push this branch out of range, so another
/// range extension thunk will be needed.
beq.w early
-// CHECK: 113008: beq.w #-4100
+// CHECK: 11100c: beq.w #-8
.section .text.05, "ax", %progbits
.arm