[llvm-bugs] [Bug 33913] New: clang corrupts rsp when used as output constraint (Linux kernel >= 4.6 inline asm)

via llvm-bugs llvm-bugs at lists.llvm.org
Mon Jul 24 11:59:47 PDT 2017


https://bugs.llvm.org/show_bug.cgi?id=33913

            Bug ID: 33913
           Summary: clang corrupts rsp when used as output constraint
                    (Linux kernel >= 4.6 inline asm)
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: manojgupta at google.com
                CC: llvm-bugs at lists.llvm.org

Reported by mka at chromium.org.
chromium bug: https://bugs.chromium.org/p/chromium/issues/detail?id=737659

The following upstream kernel commit intends to forces a stack frame before
inline assembly code if it doesn't already exist:


commit f05058c4d652b619adfda6c78d8f5b341169c264
Author: Chris J Arges <chris.j.arges at canonical.com>
Date:   Thu Jan 21 16:49:25 2016 -0600

    x86/uaccess: Add stack frame output operand in get_user() inline asm

    Numerous 'call without frame pointer save/setup' warnings are introduced
    by stacktool because of functions using the get_user() macro. Bad stack
    traces could occur due to lack of or misplacement of stack frame setup
    code.

    This patch forces a stack frame to be created before the inline asm code
    if CONFIG_FRAME_POINTER is enabled by listing the stack pointer as an
    output operand for the get_user() inline assembly statement.

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index a4a30e4b2d34..9bbb3b2d0372 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -179,10 +179,11 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL),
0ULL, 0UL))
 ({                                                                     \
        int __ret_gu;                                                   \
        register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX);            \
+       register void *__sp asm(_ASM_SP);                               \
        __chk_user_ptr(ptr);                                            \
        might_fault();                                                  \
-       asm volatile("call __get_user_%P3"                              \
-                    : "=a" (__ret_gu), "=r" (__val_gu)                 \
+       asm volatile("call __get_user_%P4"                              \
+                    : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp)    \
                     : "0" (ptr), "i" (sizeof(*(ptr))));                \


This inline asm causes double fault when compiled with clang.

Analysis by Josh Poimboeuf:

  Here's the reason for the double fault.  First it puts zero on the stack
  at offset -0x58:

  > ffffffff81367616:     31 c0                   xor    %eax,%eax
  > ffffffff81367618:     48 89 45 c8             mov    %rax,-0x38(%rbp)
  > ffffffff8136761c:     45 31 ff                xor    %r15d,%r15d
  > ffffffff8136761f:     48 89 45 a8             mov    %rax,-0x58(%rbp)

  Then, later, it copies that zeroed word from the stack to RSP:

  > ffffffff81367874:     48 8b 65 a8             mov    -0x58(%rbp),%rsp

  Then it double faults because the call instruction tries to write RIP on
  the stack, but RSP is zero:

  > ffffffff81367878:     e8 73 26 f1 ff          callq  ffffffff81279ef0
<__get_user_4>

  Then clang tries to put RSP's value on the stack, at the same stack slot
  where the original zero was stored (though it never reaches this point):

  > ffffffff8136787d:     49 89 d4                mov    %rdx,%r12
  > ffffffff81367880:     48 89 65 a8             mov    %rsp,-0x58(%rbp)

  The panic is consistent with the above.  RIP points to the call
  instruction, RSP is zero:

  > [    3.798722] PANIC: double fault, error_code: 0x0
  > [    3.807387] CPU: 1 PID: 605 Comm: frecon Not tainted
4.12.0-00023-g711d82c128ff #107
  > [    3.816040] Hardware name: GOOGLE Squawks, BIOS
Google_Squawks.5216.152.76 03/04/2016
  > [    3.824792] task: ffff880075b92f00 task.stack: ffffc90000d6c000
  > [    3.829599] EXT4-fs (mmcblk0p1): re-mounted. Opts:
commit=600,data=ordered
  > [    3.839092] RIP: 0010:drm_mode_setcrtc+0x328/0x51f
  > [    3.844443] RSP: 0018:0000000000000000 EFLAGS: 00010206
  > [    3.850280] RAX: 0000559e707c4d60 RBX: 0000000000000000 RCX:
0000000000000008
  > [    3.858253] RDX: 0000000000000001 RSI: ffffc90000d6fcc8 RDI:
ffffffff81367805
  > [    3.866225] RBP: ffffc90000d6fd90 R08: 00000000014000c0 R09:
0000000000000308
  > [    3.874199] R10: 0000000000000300 R11: 0000000000000556 R12:
0000000000000000
  > [    3.882163] R13: ffff880077a25000 R14: ffffc90000d6fdd0 R15:
0000000000000000
  > [    3.890136] FS:  00007fb2dbd62740(0000) GS:ffff88007ad00000(0000)
knlGS:0000000000000000
  > [    3.899177] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  > [    3.905595] CR2: fffffffffffffff8 CR3: 000000007596b000 CR4:
00000000001006e0
  > [    3.913568] Call Trace:
  > [    3.916296] Code: b8 48 8d b5 38 ff ff ff 41 8b 56 08 39 d3 0f 83 85 00
00 00 4c 63 fb 4a 83 24 f8 00 4a 8d 04 bd 00 00 00 00 49 03 06 48 8b 65 a8 <e8>
73 26 f1 ff 49 89 d4 48 89 65 a8 85 c0 0f 85 a0 00 00 00 ba
  > [    3.937440] Kernel panic - not syncing: Machine halted.
  > [    3.943268] CPU: 1 PID: 605 Comm: frecon Not tainted
4.12.0-00023-g711d82c128ff #107
  > [    3.951921] Hardware name: GOOGLE Squawks, BIOS
Google_Squawks.5216.152.76 03/04/2016
  > [    3.960671] Call Trace:
  > [    3.963398]  <#DF>
  > [    3.965637]  __dump_stack+0x19/0x1b
  > [    3.969531]  dump_stack+0x42/0x60

  clang is obviously getting confused by the RSP output constraint.  I
  think it tries to take the constraint literally, since it takes RSP as
  an output from the inline asm and stores it on the stack.  However, that
  behavior doesn't really make sense for a "register" variable.  It also
  doesn't explain why it's zeroing the register out first.

Link with the discussion: https://patchwork.kernel.org/patch/9837437/

More info:
 there are two separate issues here.

  1) The first issue is whether it's supported behavior to specify RSP as
     an output constraint in order to force GCC to create a stack frame.
     As far as I know, this is a quirk of GCC, and not really considered
     defined behavior.

     However, the idea was suggested by some GCC developers:

       https://gcc.gnu.org/ml/gcc/2015-07/msg00079.html

     So at least it seems to be endorsed by GCC to some degree.  If you
     need details on why it works, that thread has the details.

  2) The second issue is whether clang should corrupt RSP.  I don't see a
     reason for clang to do that.  IMO, when using a local register
     variable as an input or output to inline asm, the compiler should
     leave the contents of the register alone.

     FWIW, my reading of the GCC manual seems to support that:

      
https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20170724/b8aca843/attachment.html>


More information about the llvm-bugs mailing list