[llvm-bugs] [Bug 45215] New: attribute((interrupt)) for x86-64 breaks stack alignment when ISR has an error code

Mon Mar 16 06:58:35 PDT 2020

https://bugs.llvm.org/show_bug.cgi?id=45215

            Bug ID: 45215
           Summary: __attribute__((interrupt)) for x86-64 breaks stack
                    alignment when ISR has an error code
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: uchan0 at gmail.com
                CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
                    llvm-dev at redking.me.uk, spatel+llvm at rotateright.com

Created attachment 23240
  --> https://bugs.llvm.org/attachment.cgi?id=23240&action=edit
sample code of an ISR

Clang generates incorrect instructions for a x86-64 interrupt service routine
(ISR).

Inside X86FrameLowering::emitPrologue(), a code snippet tries to adjust stack
alignment to meet x86-64 ABI when it is an ISR with 2 arguments (a pointer to
stack frame and an error code).

The code snippet is
https://github.com/llvm/llvm-project/blob/c5ff3df839321847ab7558ffb292f725d0356dfe/llvm/lib/Target/X86/X86FrameLowering.cpp#L1222
This snippet is originally introduced by this commit
https://github.com/llvm/llvm-project/commit/0389f62879fc9162fae12e57b93ae3565c915356

In conclusion, I suggest to remove `StackSize += 8;`.

## Background

X86-64 ABI requires the stack pointer is aligned on 16 byte boundary to call
another function.
In other words, the value (%rsp + 8) must be a multiple of 16 when control is
transferred to the function entry point.
This rule avoids some instructions which require 16 byte alignment emitting a
general protection fault.

This rule is for normal functions, not for ISRs.
Intel SDM Vol.3 6.14.2 says "In IA-32e mode, the RSP is aligned to a 16-byte
boundary before pushing the stack frame."
In addition to this, CPU will push 5 registers and an optional value on to the
stack: %ss, %rsp, %rflags, %cs, %rip, error code.
In the case that an error code is pushed, %rsp shall be aligned on 16 byte
boundary when control is transferred to the ISR.

## Problem

Alignment-sensitive instructions inside a function called from an ISR will
generate a general protection fault, because X86FrameLowering::emitPrologue()
emits incorrect stack subtraction code like `subq $264, %rsp`.

Assume the following code:

// isr.cpp
void bar(void*);
__attribute__((interrupt))
void foo(void* frame, unsigned long error_code) {
  bar(frame);
}

`clang++ -S -o isr.s isr.cpp` (with Clang 11.0.0 github master version) will
generate the following instructions. Debug lines are omitted.

        .text
        .file   "isr.cpp"
        .globl  _Z3fooPvm               # -- Begin function _Z3fooPvm
        .p2align        4, 0x90
        .type   _Z3fooPvm, at function
_Z3fooPvm:                              # @_Z3fooPvm
.L_Z3fooPvm$local:
# %bb.0:                                # %entry
        pushq   %rax
        pushq   %rbp
        movq    %rsp, %rbp
        pushq   %rax
        pushq   %r11
        pushq   %r10
        pushq   %r9
        pushq   %r8
        pushq   %rdi
        pushq   %rsi
        pushq   %rdx
        pushq   %rcx
        subq    $288, %rsp              # imm = 0x120
        movaps  %xmm15, -96(%rbp)       # 16-byte Spill
        movaps  %xmm14, -112(%rbp)      # 16-byte Spill
        movaps  %xmm13, -128(%rbp)      # 16-byte Spill
        movaps  %xmm12, -144(%rbp)      # 16-byte Spill
        movaps  %xmm11, -160(%rbp)      # 16-byte Spill
        movaps  %xmm10, -176(%rbp)      # 16-byte Spill
        movaps  %xmm9, -192(%rbp)       # 16-byte Spill
        movaps  %xmm8, -208(%rbp)       # 16-byte Spill
        movaps  %xmm7, -224(%rbp)       # 16-byte Spill
        movaps  %xmm6, -240(%rbp)       # 16-byte Spill
        movaps  %xmm5, -256(%rbp)       # 16-byte Spill
        movaps  %xmm4, -272(%rbp)       # 16-byte Spill
        movaps  %xmm3, -288(%rbp)       # 16-byte Spill
        movaps  %xmm2, -304(%rbp)       # 16-byte Spill
        movaps  %xmm1, -320(%rbp)       # 16-byte Spill
        movaps  %xmm0, -336(%rbp)       # 16-byte Spill
        cld
        movq    16(%rbp), %rax
        leaq    24(%rbp), %rcx
        movq    %rcx, -344(%rbp)
        movq    -344(%rbp), %rdi
        movq    %rax, -352(%rbp)        # 8-byte Spill
        callq   _Z3barPv
        movaps  -336(%rbp), %xmm0       # 16-byte Reload
        movaps  -320(%rbp), %xmm1       # 16-byte Reload
    ...

Since the number of `pushq` is 11, the value (%rsp + 8) is a multiple of 16
just before `subq $288, %rsp`. And it holds just after the subtraction because
subtracting 288 (which is multiple of 16) from the stack pointer does not
affect the alignment.
As a result, `callq` instruction will be called on misaligned stack, then the
value (%rsp + 8) is NOT a multiple of 16 when control is transferred to the
function entry point.

## Solution

I experimented 2 solutions:
1. remove `StackSize += 8;`
2. remove `emitSPUpdate(MBB, MBBI, DL, -8, /*InEpilogue=*/false);`

Solution #1 looks nice, while #2 breaks some instructions.

Solution #1 gives us the following code:

        .text
        <same as original code>
        subq    $280, %rsp              # imm = 0x118    <---- DIFF
        <same as original code>
        callq   _Z3barPv
        ...

Only 2 lines are different: `subq` and its complementary instruction `addq`
(not shown in the code).
In this code, the `callq` is executed on the well-aligned stack.

Solution #2 gives us the following code:

        .text
        .file   "isr.cpp"
        .globl  _Z3fooPvm               # -- Begin function _Z3fooPvm
        .p2align        4, 0x90
        .type   _Z3fooPvm, at function
_Z3fooPvm:                              # @_Z3fooPvm
.L_Z3fooPvm$local:
# %bb.0:                                # %entry
        pushq   %rbp           <------ `pushq %rax` has been vanished
        movq    %rsp, %rbp
        pushq   %rax
        pushq   %r11
        pushq   %r10
        pushq   %r9
        pushq   %r8
        pushq   %rdi
        pushq   %rsi
        pushq   %rdx
        pushq   %rcx
        subq    $288, %rsp              # imm = 0x120    <---- same as original
code
        movaps  %xmm15, -96(%rbp)       # 16-byte Spill
        movaps  %xmm14, -112(%rbp)      # 16-byte Spill
        movaps  %xmm13, -128(%rbp)      # 16-byte Spill
        movaps  %xmm12, -144(%rbp)      # 16-byte Spill
        movaps  %xmm11, -160(%rbp)      # 16-byte Spill
        movaps  %xmm10, -176(%rbp)      # 16-byte Spill
        movaps  %xmm9, -192(%rbp)       # 16-byte Spill
        movaps  %xmm8, -208(%rbp)       # 16-byte Spill
        movaps  %xmm7, -224(%rbp)       # 16-byte Spill
        movaps  %xmm6, -240(%rbp)       # 16-byte Spill
        movaps  %xmm5, -256(%rbp)       # 16-byte Spill
        movaps  %xmm4, -272(%rbp)       # 16-byte Spill
        movaps  %xmm3, -288(%rbp)       # 16-byte Spill
        movaps  %xmm2, -304(%rbp)       # 16-byte Spill
        movaps  %xmm1, -320(%rbp)       # 16-byte Spill
        movaps  %xmm0, -336(%rbp)       # 16-byte Spill
        <same as original code>
        callq   _Z3barPv

In this code, the number of `pushq` is 10 because `pushq %rax` is vanished. So
%rsp shall be a multiple of 16 when `callq` is executed.
A problem is that the displacements of `movaps` (such as -96 and -112) are
multiple of 16, although the value %rbp is not a multiple of 16.
`movaps` generates a general protection fault when a memory address (%rsp - 96)
is not aligned on 16 byte boundary.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20200316/fc2f1911/attachment.html>

[llvm-bugs] [Bug 45215] New: __attribute__((interrupt)) for x86-64 breaks stack alignment when ISR has an error code

[llvm-bugs] [Bug 45215] New: attribute((interrupt)) for x86-64 breaks stack alignment when ISR has an error code