<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - __attribute__((interrupt)) for x86-64 breaks stack alignment when ISR has an error code"
href="https://bugs.llvm.org/show_bug.cgi?id=45215">45215</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>__attribute__((interrupt)) for x86-64 breaks stack alignment when ISR has an error code
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Windows NT
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>uchan0@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, spatel+llvm@rotateright.com
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=23240" name="attach_23240" title="sample code of an ISR">attachment 23240</a> <a href="attachment.cgi?id=23240&action=edit" title="sample code of an ISR">[details]</a></span>
sample code of an ISR
Clang generates incorrect instructions for a x86-64 interrupt service routine
(ISR).
Inside X86FrameLowering::emitPrologue(), a code snippet tries to adjust stack
alignment to meet x86-64 ABI when it is an ISR with 2 arguments (a pointer to
stack frame and an error code).
The code snippet is
<a href="https://github.com/llvm/llvm-project/blob/c5ff3df839321847ab7558ffb292f725d0356dfe/llvm/lib/Target/X86/X86FrameLowering.cpp#L1222">https://github.com/llvm/llvm-project/blob/c5ff3df839321847ab7558ffb292f725d0356dfe/llvm/lib/Target/X86/X86FrameLowering.cpp#L1222</a>
This snippet is originally introduced by this commit
<a href="https://github.com/llvm/llvm-project/commit/0389f62879fc9162fae12e57b93ae3565c915356">https://github.com/llvm/llvm-project/commit/0389f62879fc9162fae12e57b93ae3565c915356</a>
In conclusion, I suggest to remove `StackSize += 8;`.
## Background
X86-64 ABI requires the stack pointer is aligned on 16 byte boundary to call
another function.
In other words, the value (%rsp + 8) must be a multiple of 16 when control is
transferred to the function entry point.
This rule avoids some instructions which require 16 byte alignment emitting a
general protection fault.
This rule is for normal functions, not for ISRs.
Intel SDM Vol.3 6.14.2 says "In IA-32e mode, the RSP is aligned to a 16-byte
boundary before pushing the stack frame."
In addition to this, CPU will push 5 registers and an optional value on to the
stack: %ss, %rsp, %rflags, %cs, %rip, error code.
In the case that an error code is pushed, %rsp shall be aligned on 16 byte
boundary when control is transferred to the ISR.
## Problem
Alignment-sensitive instructions inside a function called from an ISR will
generate a general protection fault, because X86FrameLowering::emitPrologue()
emits incorrect stack subtraction code like `subq $264, %rsp`.
Assume the following code:
// isr.cpp
void bar(void*);
__attribute__((interrupt))
void foo(void* frame, unsigned long error_code) {
bar(frame);
}
`clang++ -S -o isr.s isr.cpp` (with Clang 11.0.0 github master version) will
generate the following instructions. Debug lines are omitted.
.text
.file "isr.cpp"
.globl _Z3fooPvm # -- Begin function _Z3fooPvm
.p2align 4, 0x90
.type _Z3fooPvm,@function
_Z3fooPvm: # @_Z3fooPvm
.L_Z3fooPvm$local:
# %bb.0: # %entry
pushq %rax
pushq %rbp
movq %rsp, %rbp
pushq %rax
pushq %r11
pushq %r10
pushq %r9
pushq %r8
pushq %rdi
pushq %rsi
pushq %rdx
pushq %rcx
subq $288, %rsp # imm = 0x120
movaps %xmm15, -96(%rbp) # 16-byte Spill
movaps %xmm14, -112(%rbp) # 16-byte Spill
movaps %xmm13, -128(%rbp) # 16-byte Spill
movaps %xmm12, -144(%rbp) # 16-byte Spill
movaps %xmm11, -160(%rbp) # 16-byte Spill
movaps %xmm10, -176(%rbp) # 16-byte Spill
movaps %xmm9, -192(%rbp) # 16-byte Spill
movaps %xmm8, -208(%rbp) # 16-byte Spill
movaps %xmm7, -224(%rbp) # 16-byte Spill
movaps %xmm6, -240(%rbp) # 16-byte Spill
movaps %xmm5, -256(%rbp) # 16-byte Spill
movaps %xmm4, -272(%rbp) # 16-byte Spill
movaps %xmm3, -288(%rbp) # 16-byte Spill
movaps %xmm2, -304(%rbp) # 16-byte Spill
movaps %xmm1, -320(%rbp) # 16-byte Spill
movaps %xmm0, -336(%rbp) # 16-byte Spill
cld
movq 16(%rbp), %rax
leaq 24(%rbp), %rcx
movq %rcx, -344(%rbp)
movq -344(%rbp), %rdi
movq %rax, -352(%rbp) # 8-byte Spill
callq _Z3barPv
movaps -336(%rbp), %xmm0 # 16-byte Reload
movaps -320(%rbp), %xmm1 # 16-byte Reload
...
Since the number of `pushq` is 11, the value (%rsp + 8) is a multiple of 16
just before `subq $288, %rsp`. And it holds just after the subtraction because
subtracting 288 (which is multiple of 16) from the stack pointer does not
affect the alignment.
As a result, `callq` instruction will be called on misaligned stack, then the
value (%rsp + 8) is NOT a multiple of 16 when control is transferred to the
function entry point.
## Solution
I experimented 2 solutions:
1. remove `StackSize += 8;`
2. remove `emitSPUpdate(MBB, MBBI, DL, -8, /*InEpilogue=*/false);`
Solution #1 looks nice, while #2 breaks some instructions.
Solution #1 gives us the following code:
.text
<same as original code>
subq $280, %rsp # imm = 0x118 <---- DIFF
<same as original code>
callq _Z3barPv
...
Only 2 lines are different: `subq` and its complementary instruction `addq`
(not shown in the code).
In this code, the `callq` is executed on the well-aligned stack.
Solution #2 gives us the following code:
.text
.file "isr.cpp"
.globl _Z3fooPvm # -- Begin function _Z3fooPvm
.p2align 4, 0x90
.type _Z3fooPvm,@function
_Z3fooPvm: # @_Z3fooPvm
.L_Z3fooPvm$local:
# %bb.0: # %entry
pushq %rbp <------ `pushq %rax` has been vanished
movq %rsp, %rbp
pushq %rax
pushq %r11
pushq %r10
pushq %r9
pushq %r8
pushq %rdi
pushq %rsi
pushq %rdx
pushq %rcx
subq $288, %rsp # imm = 0x120 <---- same as original
code
movaps %xmm15, -96(%rbp) # 16-byte Spill
movaps %xmm14, -112(%rbp) # 16-byte Spill
movaps %xmm13, -128(%rbp) # 16-byte Spill
movaps %xmm12, -144(%rbp) # 16-byte Spill
movaps %xmm11, -160(%rbp) # 16-byte Spill
movaps %xmm10, -176(%rbp) # 16-byte Spill
movaps %xmm9, -192(%rbp) # 16-byte Spill
movaps %xmm8, -208(%rbp) # 16-byte Spill
movaps %xmm7, -224(%rbp) # 16-byte Spill
movaps %xmm6, -240(%rbp) # 16-byte Spill
movaps %xmm5, -256(%rbp) # 16-byte Spill
movaps %xmm4, -272(%rbp) # 16-byte Spill
movaps %xmm3, -288(%rbp) # 16-byte Spill
movaps %xmm2, -304(%rbp) # 16-byte Spill
movaps %xmm1, -320(%rbp) # 16-byte Spill
movaps %xmm0, -336(%rbp) # 16-byte Spill
<same as original code>
callq _Z3barPv
In this code, the number of `pushq` is 10 because `pushq %rax` is vanished. So
%rsp shall be a multiple of 16 when `callq` is executed.
A problem is that the displacements of `movaps` (such as -96 and -112) are
multiple of 16, although the value %rbp is not a multiple of 16.
`movaps` generates a general protection fault when a memory address (%rsp - 96)
is not aligned on 16 byte boundary.</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are on the CC list for the bug.</li>
</ul>
</body>
</html>