<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/58961>58961</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[AArch64][FrameDestroy] Wrong stack adjustment instruction placement
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
zhroma
</td>
</tr>
</table>
<pre>
This was initially found on a patched LLVM with a few another calling conventions implemented.
To show the bug lets apply the following one-liner patch first:
```diff
diff --git a/llvm/lib/Target/AArch64/AArch64CallingConvention.td b/llvm/lib/Target/AArch64/AArch64CallingConvention.td
--- a/llvm/lib/Target/AArch64/AArch64CallingConvention.td
+++ b/llvm/lib/Target/AArch64/AArch64CallingConvention.td
@@ -487,6 +487,7 @@ def CSR_AArch64_AllRegs
def CSR_AArch64_NoRegs : CalleeSavedRegs<(add)>;
def CSR_AArch64_RT_MostRegs : CalleeSavedRegs<(add CSR_AArch64_AAPCS,
+ (sequence "D%u", 0, 31),
(sequence "X%u", 9, 15))>;
def CSR_AArch64_StackProbe_Windows
```
It just adds all `d0-d31` registers into callee-saved set for `preserve_most` convention on aarch64.
After applying the patch (checked v15.0.3 and recent main at e044796132abd9e10e21093822dc234003628fbd) lets proceed with compiling the following simple snippet, like `clang test.ll --target=arm64 -S -g0`
```llvm
declare void @bar()
define preserve_mostcc void @foo() {
call void @bar()
ret void
}
```
Assembly file shows the following:
```asm
sub sp, sp, #272
str x15, [sp] // 8-byte Folded Spill
stp x14, x13, [sp, #16] // 16-byte Folded Spill
stp x12, x11, [sp, #32] // 16-byte Folded Spill
stp x10, x9, [sp, #48] // 16-byte Folded Spill
stp d31, d30, [sp, #64] // 16-byte Folded Spill
stp d29, d28, [sp, #80] // 16-byte Folded Spill
stp d27, d26, [sp, #96] // 16-byte Folded Spill
stp d25, d24, [sp, #112] // 16-byte Folded Spill
stp d23, d22, [sp, #128] // 16-byte Folded Spill
stp d21, d20, [sp, #144] // 16-byte Folded Spill
stp d19, d18, [sp, #160] // 16-byte Folded Spill
stp d17, d16, [sp, #176] // 16-byte Folded Spill
stp d7, d6, [sp, #192] // 16-byte Folded Spill
stp d5, d4, [sp, #208] // 16-byte Folded Spill
stp d3, d2, [sp, #224] // 16-byte Folded Spill
stp d1, d0, [sp, #240] // 16-byte Folded Spill
stp x29, x30, [sp, #256] // 16-byte Folded Spill
bl bar
ldp x29, x30, [sp, #256] // 16-byte Folded Reload
ldp d1, d0, [sp, #240] // 16-byte Folded Reload
ldp d3, d2, [sp, #224] // 16-byte Folded Reload
ldp d5, d4, [sp, #208] // 16-byte Folded Reload
ldp d7, d6, [sp, #192] // 16-byte Folded Reload
ldp d17, d16, [sp, #176] // 16-byte Folded Reload
ldp d19, d18, [sp, #160] // 16-byte Folded Reload
ldp d21, d20, [sp, #144] // 16-byte Folded Reload
ldp d23, d22, [sp, #128] // 16-byte Folded Reload
ldp d25, d24, [sp, #112] // 16-byte Folded Reload
ldp d27, d26, [sp, #96] // 16-byte Folded Reload
ldp d29, d28, [sp, #80] // 16-byte Folded Reload
ldp d31, d30, [sp, #64] // 16-byte Folded Reload
ldp x10, x9, [sp, #48] // 16-byte Folded Reload
ldp x12, x11, [sp, #32] // 16-byte Folded Reload
ldp x14, x13, [sp, #16] // 16-byte Folded Reload
add sp, sp, #272
ldr x15, [sp] // 8-byte Folded Reload
ret
```
You can see register `x15` is restored from the wrong stack slot - to make it work, `add sp, sp, #272` instruction must be placed after the last load instead.
Locally workarounded with the following:
```diff
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -1184,12 +1184,14 @@ static MachineBasicBlock::iterator convertCalleeSaveRestoreToSPPrePostIncDec(
MachineFunction &MF = *MBB.getParent();
if (MBBI->getOperand(MBBI->getNumOperands() - 1).getImm() != 0 ||
CSStackSizeInc < MinOffset || CSStackSizeInc > MaxOffset) {
- emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
+ MachineBasicBlock::iterator MBBIn =
+ FrameFlag == MachineInstr::FrameDestroy ? std::next(MBBI) : MBBI;
+ emitFrameOffset(MBB, MBBIn, DL, AArch64::SP, AArch64::SP,
StackOffset::getFixed(CSStackSizeInc), TII, FrameFlag,
false, false, nullptr, EmitCFI,
StackOffset::getFixed(CFAOffset));
- return std::prev(MBBI);
+ return std::prev(MBBIn);
}
MachineInstrBuilder MIB = BuildMI(MBB, MBBI, DL, TII->get(NewOpc));
```
but have no idea whether this solution is full and proper.
CC @TNorthover as author of that code part.
PS. On stock compiler I've tried to compile all available LLVM tests including the whole test suite for arm64 target - compiler never reaches this particular `emitFrameOffset` call.
</pre>
<img width="1px" height="1px" alt="" src="http://email.email.llvm.org/o/eJytWVtv4jgU_jXpixWUOBDggQcug4TUzlSl2tl9qkzigLcmzsZO6eyv33NsoBDa7JQURcSJ7c_nbp-TlUp_jR43QpMd00Tkwggm5S-SqSpPicoJIwUzyYan5Pb2jzuyE2YD7zK-IyxXZsNLksAEka9JovIXnhuhcgDaFpJv4YmnHS-YecHY_T8qojdqR2AiWVVrIrnRhBUFrIivMiWl2iGYyrkPqABvlyeZKLXxovEpmBcH7kpFlrlX2CK-vxaGMI_OpXzZ4k2s4P-RlWtuoDEel8km7r61po6D6ZGBjknJqt18R4_v-20J2fNKJ-76Grq8bgAX8buDvkenMQFg1-yTfVfKMzJdPjztoZ7GUj7wtXbTL3q_K-wkoCCCi3K-ZC88tROiqUcHLE09OvSib140OVXhBdDD49Od0uaI9iHcOXHj--kSqD_KinzyB5ia_1PxPOHQpjOP9iq4AyQJ8C8KkfzDAp9Fr8P_eQKPqCTsWfjfEdDSsOT5vlQr_vRT5Kna6ZoznM5eGPJ3pcEX0hTcTEqCzhL4KbATB6Tka6ENL9HvjbJ-zLmvUdREcwPeWOKEouSaly_8aQuKwXlvjm4DBLOEnbn5OANY59fozOjazo1BEBBLkmdY4SXsdYJOBGEkBUoSACRbJgDPEB50u_1hHEaUrdIhDwNOw2AYDShNExp1gyCK6SBboUm5CFKUKuGAaaNToraFkId130KKtkGJ6FwUBfrLlEjxzJHDRDIczbXpgIx83ziPimas3MZd4i-Jv66L9ihv64su-HAAKjl5USJFN1qxEvhFtZ7MA31CWCNnQk2S45xMKTeHeP3JwdpQMx-hEhCesZ37VfqzBoMYa823K4zvAkUBoVifS-mjEMv09tz2dbVy9wIl6f49GtE-rY0zpb2_oo3DkN4ExvZmHzrKHC4y8Fe_DCdzJVPQ6hLUKeuoxR61i6ivYfQG7ggJ4_oqe-ww_n1w6sDDOnhEvwDcBpbXYR27O7gQz6fB0cEBLY2COjpsD20pT20khNugDj4IvgC878DjOviwvUJT2nPg3QtrCesavQI8cuD0ApzWVXoFuFMovVBo2K1r9PPgoVNoeKHQMK5r9Apwp9DwQqFhv67Rz4M77Evo4YWDXgHujOXCVmjwJR7q9HkBTi8c9BqZW_ALW6HdCwe9InA593-9jC20d7VCV3J_h83trEOmX7XqA5eKpe-jtxdYI3prXTeitzbTRvTWHtYs97bBoRm9bVxrRG8dkpvR2-4mzehtN8Jm9LZ7eDN62-NHs6-2PTk1obc_9DWjtz2vNqO3PWq_h46ZPP7-L42Qaes04r3lIXdqyJb-UhXkXjkkw_yYLGO-iERAHiw0vNVGlQCelWprM6ldqTDXxCydaKkM8Qnk1lsGmaYwZKfKZ0s_ZFTA-TtcI24OSVOV2AR7i-n7CvJFyRJYhtnEGteRDDqQITucs_M6261KbBkP12Ml1vIO2fFvZXttC2rzkm35rdrxEhbpJEXxmarVxeSrSmkfoFxVR_sIa19EC8MBukZIsYx2eOgeCmlgCkYk5I4lG0j8J0yLZCJV8oyyj8YC9MnAhFxRpTRv5a4HZ1qPanl_X_J7pc0iT2Y8weT_YMN70HmVO2vxaHw3J140g9b4bjLpAEP3rOS52ZcMomNNgYgMqzEwaOF70TcY-KMAUvL0_OX3art_r_d1CZ9gLQyRF9vtoVYBeyEsGhCvP8Xr1MemS1uyWop_OTAAxE3Jnch_ZBmWmNz4yzHfgLVXN-asFuIjIt8KY1VyGIAEowsh3Xif3eL_QY1Wzsv791-dVQz_T0eIn6N4a3VGS8xcsrXtA0HsgRboyA7DDpmBSkv1C0bNwSxS15Pz1z0LC8tqNHZ8HOuAbpVGpvOruH43dFpF7NewE0DRc_HK0S7O1eQqouRxYWV-lEEzfMak5jj-2MgrKQsQEzS_AY_T-aIFgfPx0WhOzf3EeCDiV2X-Jv6i5C9v4q9LvWF0XvOnt8JbzTutFUwqAZsQmNBiYv3TPt8tPrJdEOreAWHEd777USR1ls43rFVlyAaiBskVESlnZLfh9uuMwW87WsnKBghoZyBvW3QtSgWefbZtTKcYtR6_q9Js1AuWcDVhFbRLojKAYgbiVIrF3NKcTbxfdsgPlBM4zb4AC7OBoT6QZEoBuw9WmF2HrUSzFyYkW8GT_a6ExVesQyeySg-l291GQTf2EF2BE9qCtCvJugotxKLjWjlHeksOMufacY1UiqSSzG7adQfCQjYQ0rnhozCOB1Gv24_jm3QUpcNoyG6MMJKP4Jxx8J7eDB5O3RiPHz9PdnuWYq0dP3qd7eB258a3N1UpRxtjCo22ZE8osK1uqlUHmDjZjODmg27-5gluRkLrikPsnfcGwzi82YyGvX43zeIoC7MA3tBBLxsMuqw_HMI7GgY3IFUuNZLuUZrzHbEQ-LmhN7sRIxpQGoZhFPZ6tBt36JBlQToIuhEPV2HIQf98C6rpIB0dVa5vypElaVWtNXRKOAXpt06mtVjn3EoK8Z21jP7dwHmI3diVR5by_wCE97Ae">