[llvm] 7bc03f5 - [MachineOutliner][AArch64] WA for multiple stack fixup cases in MachineOutliner.
Puyan Lotfi via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 10 12:44:36 PDT 2020
Author: Puyan Lotfi
Date: 2020-08-10T15:43:30-04:00
New Revision: 7bc03f55539f7f081daea5363f2e4845b2e75f57
URL: https://github.com/llvm/llvm-project/commit/7bc03f55539f7f081daea5363f2e4845b2e75f57
DIFF: https://github.com/llvm/llvm-project/commit/7bc03f55539f7f081daea5363f2e4845b2e75f57.diff
LOG: [MachineOutliner][AArch64] WA for multiple stack fixup cases in MachineOutliner.
In cases where MachineOutliner candidates either are:
* noreturn
* have calls with no available LR or free regs
* Don't use SP
we can end up hitting stack fixup code for the caller and the callee for
a FrameID of MachineOutlinerDefault. This triggers the assert:
`assert(OF.FrameConstructionID != MachineOutlinerDefault &&
"Can only fix up stack references once");`
in AArch64InstrInfo.cpp. This assert exists for now because a lot of the
fixup code is not tested to handle fixing up more than once and needs
some better checks and enhancements to avoid potentially generating
illegal code.
I've filed a Bugzilla report to track this until these cases are handled
by the AArch64 MachineOutliner: https://bugs.llvm.org/show_bug.cgi?id=46767
This diff detects cases that will cause these multiple stack fixups and
prune the Candidates from `RepeatedSequenceLocs`.
Differential Revision: https://reviews.llvm.org/D83923
Added:
llvm/test/CodeGen/AArch64/machine-outliner-2fixup-blr-terminator.mir
llvm/test/CodeGen/AArch64/machine-outliner-no-noreturn-no-stack.mir
llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir
Modified:
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
Removed:
################################################################################
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index b6fda6b367bf..86aacedebdfe 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -6193,6 +6193,60 @@ outliner::OutlinedFunction AArch64InstrInfo::getOutliningCandidateInfo(
FrameID = MachineOutlinerNoLRSave;
} else {
SetCandidateCallInfo(MachineOutlinerDefault, 12);
+
+ // Bugzilla ID: 46767
+ // TODO: Check if fixing up the stack more than once is safe so we can
+ // outline these.
+ //
+ // An outline resulting in a caller that requires stack fixups at the
+ // callsite to a callee that also requires stack fixups can happen when
+ // there are no available registers at the candidate callsite for a
+ // candidate that itself also has calls.
+ //
+ // In other words if function_containing_sequence in the following pseudo
+ // assembly requires that we save LR at the point of the call, but there
+ // are no available registers: in this case we save using SP and as a
+ // result the SP offsets requires stack fixups by multiples of 16.
+ //
+ // function_containing_sequence:
+ // ...
+ // save LR to SP <- Requires stack instr fixups in OUTLINED_FUNCTION_N
+ // call OUTLINED_FUNCTION_N
+ // restore LR from SP
+ // ...
+ //
+ // OUTLINED_FUNCTION_N:
+ // save LR to SP <- Requires stack instr fixups in OUTLINED_FUNCTION_N
+ // ...
+ // bl foo
+ // restore LR from SP
+ // ret
+ //
+ // Because the code to handle more than one stack fixup does not
+ // currently have the proper checks for legality, these cases will assert
+ // in the AArch64 MachineOutliner. This is because the code to do this
+ // needs more hardening, testing, better checks that generated code is
+ // legal, etc and because it is only verified to handle a single pass of
+ // stack fixup.
+ //
+ // The assert happens in AArch64InstrInfo::buildOutlinedFrame to catch
+ // these cases until they are known to be handled. Bugzilla 46767 is
+ // referenced in comments at the assert site.
+ //
+ // To avoid asserting (or generating non-legal code on noassert builds)
+ // we remove all candidates which would need more than one stack fixup by
+ // pruning the cases where the candidate has calls while also having no
+ // available LR and having no available general purpose registers to copy
+ // LR to (ie one extra stack save/restore).
+ //
+ if (FlagsSetInAll & MachineOutlinerMBBFlags::HasCalls) {
+ erase_if(RepeatedSequenceLocs, [this](outliner::Candidate &C) {
+ return (std::any_of(
+ C.front(), std::next(C.back()),
+ [](const MachineInstr &MI) { return MI.isCall(); })) &&
+ (!C.LRU.available(AArch64::LR) || !findRegisterToSaveLRTo(C));
+ });
+ }
}
// If we dropped all of the candidates, bail out here.
@@ -6616,6 +6670,9 @@ void AArch64InstrInfo::buildOutlinedFrame(
if (std::any_of(MBB.instr_begin(), MBB.instr_end(), IsNonTailCall)) {
// Fix up the instructions in the range, since we're going to modify the
// stack.
+
+ // Bugzilla ID: 46767
+ // TODO: Check if fixing up twice is safe so we can outline these.
assert(OF.FrameConstructionID != MachineOutlinerDefault &&
"Can only fix up stack references once");
fixupPostOutline(MBB);
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner-2fixup-blr-terminator.mir b/llvm/test/CodeGen/AArch64/machine-outliner-2fixup-blr-terminator.mir
new file mode 100644
index 000000000000..2111704c0898
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/machine-outliner-2fixup-blr-terminator.mir
@@ -0,0 +1,75 @@
+# RUN: llc -mtriple=aarch64--- -run-pass=machine-outliner \
+# RUN: -verify-machineinstrs %s -o - | FileCheck %s
+
+# CHECK-NOT: OUTLINED_FUNCTION
+
+--- |
+ define void @f1() #0 { ret void }
+ define void @f2() #0 { ret void }
+ define void @f3() #0 { ret void }
+ define void @f4() #0 { ret void }
+ attributes #0 = { minsize noredzone "branch-target-enforcement" }
+...
+---
+name: f1
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
+ $x20, $x19 = LDPXi $sp, 11
+ $x20, $x19 = LDPXi $sp, 12
+ $x20, $x19 = LDPXi $sp, 13
+ $x20, $x19 = LDPXi $sp, 14
+ $x20, $x19 = LDPXi $sp, 18
+ $x20, $x19 = LDPXi $sp, 19
+ $x20, $x19 = LDPXi $sp, 20
+ $x20, $x19 = LDPXi $sp, 21
+ BLR $x20, implicit $sp
+ bb.2:
+ liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
+ RET undef $lr
+...
+---
+name: f2
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
+ $x20, $x19 = LDPXi $sp, 11
+ $x20, $x19 = LDPXi $sp, 12
+ $x20, $x19 = LDPXi $sp, 13
+ $x20, $x19 = LDPXi $sp, 14
+ $x20, $x19 = LDPXi $sp, 18
+ $x20, $x19 = LDPXi $sp, 19
+ $x20, $x19 = LDPXi $sp, 20
+ $x20, $x19 = LDPXi $sp, 21
+ BLR $x20, implicit $sp
+ bb.2:
+ liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
+ RET undef $lr
+...
+---
+name: f3
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
+ $x20, $x19 = LDPXi $sp, 11
+ $x20, $x19 = LDPXi $sp, 12
+ $x20, $x19 = LDPXi $sp, 13
+ $x20, $x19 = LDPXi $sp, 14
+ $x20, $x19 = LDPXi $sp, 18
+ $x20, $x19 = LDPXi $sp, 19
+ $x20, $x19 = LDPXi $sp, 20
+ $x20, $x19 = LDPXi $sp, 21
+ BLR $x20, implicit $sp
+ bb.2:
+ liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
+ RET undef $lr
+...
+---
+name: f4
+tracksRegLiveness: true
+body: |
+ bb.0:
+ RET undef $lr
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner-no-noreturn-no-stack.mir b/llvm/test/CodeGen/AArch64/machine-outliner-no-noreturn-no-stack.mir
new file mode 100644
index 000000000000..d935cebf2b06
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/machine-outliner-no-noreturn-no-stack.mir
@@ -0,0 +1,67 @@
+# RUN: llc -mtriple=arm64-apple-ios -run-pass=prologepilog -run-pass=machine-outliner -verify-machineinstrs %s -o - | FileCheck %s
+
+# Noreturn functions conservatively need to save and restore lr.
+# When there is no available register, the stack is used at call site.
+# If the stack also needs to be set up for a call in the outlined function,
+# bail-out this case since we do not handle adjusting the stack twice.
+
+# CHECK: OUTLINED_FUNCTION
+
+--- |
+ @g = external global i32
+ define void @stack_1() #0 { ret void }
+ define void @stack_2() #0 { ret void }
+ define void @baz() { ret void }
+ attributes #0 = { noredzone }
+...
+---
+name: stack_1
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $x4, $x0, $x1, $x2, $x3
+ $w8 = MOVZWi 259, 0
+ $x9 = ADRP target-flags(aarch64-page) @g
+ $x9 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-nc) @g, 0
+ STRXui $x9, $x1, 0
+ STRHHui $w8, $x1, 8
+ $w8 = ORRWrs $wzr, $w4, 0, implicit-def $x8
+ STRXui $x8, $x3, 0
+ STPXi $x3, $xzr, $x2, 0
+ $w8 = MOVZWi 271, 0
+ STRHHui $w8, $x2, 8
+ $x8 = ORRXrs $xzr, $x0, 0
+ $x0 = ORRXrs $xzr, $x1, 0
+ $x1 = ORRXrs $xzr, $x2, 0
+ BL @baz, implicit-def dead $lr, implicit $sp, implicit $x8, implicit $x0, implicit $x1, implicit $x3, implicit $x4, implicit-def $sp, implicit-def $x5, implicit-def $x6, implicit-def $x7, implicit-def $x8, implicit-def $x9, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit-def $x13, implicit-def $x14, implicit-def $x15, implicit-def $x18
+ BRK 1
+...
+---
+name: stack_2
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $x4, $x0, $x1, $x2, $x3
+ $w8 = MOVZWi 259, 0
+ $x9 = ADRP target-flags(aarch64-page) @g
+ $x9 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-nc) @g, 0
+ STRXui $x9, $x1, 0
+ STRHHui $w8, $x1, 8
+ $w8 = ORRWrs $wzr, $w4, 0, implicit-def $x8
+ STRXui $x8, $x3, 0
+ STPXi $x3, $xzr, $x2, 0
+ $w8 = MOVZWi 271, 0
+ STRHHui $w8, $x2, 8
+ $x8 = ORRXrs $xzr, $x0, 0
+ $x0 = ORRXrs $xzr, $x1, 0
+ $x1 = ORRXrs $xzr, $x2, 0
+ BL @baz, implicit-def dead $lr, implicit $sp, implicit $x8, implicit $x0, implicit $x1, implicit $x3, implicit $x4, implicit-def $sp, implicit-def $x5, implicit-def $x6, implicit-def $x7, implicit-def $x8, implicit-def $x9, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit-def $x13, implicit-def $x14, implicit-def $x15, implicit-def $x18
+ BRK 1
+...
+---
+name: baz
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $w0, $lr, $w8
+ RET undef $lr
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir b/llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir
new file mode 100644
index 000000000000..b13887a02ebd
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir
@@ -0,0 +1,67 @@
+# RUN: llc -mtriple=arm64-apple-ios -run-pass=prologepilog -run-pass=machine-outliner -verify-machineinstrs %s -o - | FileCheck %s
+
+# Noreturn functions conservatively need to save and restore lr.
+# When there is no available register, the stack is used at call site.
+# If the stack also needs to be set up for a call in the outlined function,
+# bail-out this case since we do not handle adjusting the stack twice.
+
+# CHECK-NOT: OUTLINED_FUNCTION
+
+--- |
+ @g = external global i32
+ define void @stack_1() #0 { ret void }
+ define void @stack_2() #0 { ret void }
+ define void @baz() { ret void }
+ attributes #0 = { noredzone noreturn }
+...
+---
+name: stack_1
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $x4, $x0, $x1, $x2, $x3
+ $w8 = MOVZWi 259, 0
+ $x9 = ADRP target-flags(aarch64-page) @g
+ $x9 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-nc) @g, 0
+ STRXui $x9, $x1, 0
+ STRHHui $w8, $x1, 8
+ $w8 = ORRWrs $wzr, $w4, 0, implicit-def $x8
+ STRXui $x8, $x3, 0
+ STPXi $x3, $xzr, $x2, 0
+ $w8 = MOVZWi 271, 0
+ STRHHui $w8, $x2, 8
+ $x8 = ORRXrs $xzr, $x0, 0
+ $x0 = ORRXrs $xzr, $x1, 0
+ $x1 = ORRXrs $xzr, $x2, 0
+ BL @baz, implicit-def dead $lr, implicit $sp, implicit $x8, implicit $x0, implicit $x1, implicit $x3, implicit $x4, implicit-def $sp, implicit-def $x5, implicit-def $x6, implicit-def $x7, implicit-def $x8, implicit-def $x9, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit-def $x13, implicit-def $x14, implicit-def $x15, implicit-def $x18
+ BRK 1
+...
+---
+name: stack_2
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $x4, $x0, $x1, $x2, $x3
+ $w8 = MOVZWi 259, 0
+ $x9 = ADRP target-flags(aarch64-page) @g
+ $x9 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-nc) @g, 0
+ STRXui $x9, $x1, 0
+ STRHHui $w8, $x1, 8
+ $w8 = ORRWrs $wzr, $w4, 0, implicit-def $x8
+ STRXui $x8, $x3, 0
+ STPXi $x3, $xzr, $x2, 0
+ $w8 = MOVZWi 271, 0
+ STRHHui $w8, $x2, 8
+ $x8 = ORRXrs $xzr, $x0, 0
+ $x0 = ORRXrs $xzr, $x1, 0
+ $x1 = ORRXrs $xzr, $x2, 0
+ BL @baz, implicit-def dead $lr, implicit $sp, implicit $x8, implicit $x0, implicit $x1, implicit $x3, implicit $x4, implicit-def $sp, implicit-def $x5, implicit-def $x6, implicit-def $x7, implicit-def $x8, implicit-def $x9, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit-def $x13, implicit-def $x14, implicit-def $x15, implicit-def $x18
+ BRK 1
+...
+---
+name: baz
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $w0, $lr, $w8
+ RET undef $lr
More information about the llvm-commits
mailing list