[llvm-dev] retpoline mitigation and 6.0

Reid Kleckner via llvm-dev llvm-dev at lists.llvm.org
Fri Feb 9 11:24:26 PST 2018


I haven't read the all the emails in full detail, but it seems pretty clear
that  __x86_indirect_thunk and __llvm_retpoline_push do not do the same
things. It sounds like __llvm_retpoline_push is equivalent to
__x86_indirect_thunk except first it swaps the two words on the top of the
stack.

I arranged it this way because the x86 call instruction puts the intended
return address on the top of the stack, and there's no easy way to put it
anywhere else. We use this thunk when we want to make an indirect call and
there are no available scratch registers, i.e. 32-bit -mregparm=3 and the
call has three or more arguments, which happens in Linux. One way to avoid
this would be to compile with -mregparm=2, but that would pessimize direct
calls unnecessarily.

It sounds like Linux is already providing its own thunks, so it might be
better to for us to go back to the __llvm_retpoline_push external thunk
name and then get Linux to provide that thunk ifdef __clang__.

On Fri, Feb 9, 2018 at 7:37 AM, David Woodhouse via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> On Fri, 2018-02-09 at 12:54 +0000, David Woodhouse wrote:
> > On Fri, 2018-02-09 at 10:36 +0000, David Woodhouse wrote:
> > >
> > > Did you get anywhere with the function attribute? Having isolated the
> > > next boot failure to "it goes away if I compile io_apic.c without
> > > retpoline", bisecting it per-function would help to further delay the
> > > bit where I actually have to start *thinking*...
> >
> > It's mp_register_ioapic(), and only when io_apic_unique_id() gets
> > inlined, at which point bad things start happening.
> >
> > [    0.000000] mp_register_ioapic, 0 fec00000 0 c1b2fe88
> > [    0.000000] At line 412, gsi_base is 0
> > [    0.000000] At line 425, gsi_base is -1043715332
> > [    0.000000] At line 427, gsi_base is -1043715332
> >
> > http://git.infradead.org/users/dwmw2/linux-retpoline.git/shortlog/ref
> > s/heads/clang
> > http://david.woodhou.se/clang32.config
> >
> >
> > http://david.woodhou.se/io_apic_b.i
> > http://david.woodhou.se/io_apic_b.noretpoline.s
> > http://david.woodhou.se/io_apic_b.retpoline.s
> >
> > I don't *think* I screwed up copying and pasting the retpoline thunk.
>
> So, looking at the retpoline version...
>
> gsi_base is in %edi, and gets spilled to the stack at about .Ltmp22
> which is at line 412 right after the printk call:
>
> .Ltmp22:
>         addl    $12, %esp
>         movl    %edi, 12(%esp)          # 4-byte Spill
>
> At .Ltmp28 we then call __x86_indirect_thunk which *does* look like
> it's doing the right thing (and using the LLVM-emitted thunk instead of
> my own behaves the same; I don't think it's my copy-paste at fault).
>
> At .Ltmp29 we call bad_ioapic_register() and then when returns zero (it
> does) we je to .LBB0_10 aka .Ltmp34. At which point we happily stomp on
> the value of 12(%esp) by spilling something *else* over it:
>
> .LBB0_10:                               # %if.end28
> .Ltmp34:
>         #DEBUG_VALUE: mp_register_ioapic:address <- $esi
>         #DEBUG_VALUE: mp_register_ioapic:cfg <- [DW_OP_plus_uconst 8]
> [$ebp+0]
>         .loc    2 0 1 is_stmt 0         # arch/x86/kernel/apic/io_apic_
> b.c:0:1
>         movl    %ebx, 12(%esp)          # 4-byte Spill
>         movl    %edi, 20(%esp)          # 4-byte Spill
>
> Compare with the noretpoline.s version where those offsets from %esp
> are different:
>
> .LBB0_10:                               # %if.end28
> .Ltmp33:
>         #DEBUG_VALUE: mp_register_ioapic:gsi_base <- [DW_OP_plus_uconst
> 12] [$esp+0]
>         #DEBUG_VALUE: mp_register_ioapic:address <- $esi
>         #DEBUG_VALUE: mp_register_ioapic:cfg <- [DW_OP_plus_uconst 8]
> [$ebp+0]
>         .loc    2 0 1 is_stmt 0         # arch/x86/kernel/apic/io_apic_
> b.c:0:1
>         movl    %ebx, 8(%esp)           # 4-byte Spill
>         movl    %edi, 16(%esp)          # 4-byte Spill
>
>
> So... obviously the main preceding difference here is that the
> retpoline version did this for an indirect call:
>
>         movl    %esi, %edx
>         pushl   pv_mmu_ops+132
> .Ltmp28:
>         calll   __x86_indirect_thunk
>
> ... while the noretpoline version did this:
>
>         movl    %esi, %edx
>         calll   *pv_mmu_ops+132
> .Ltmp28:
>
> Which leaves me suspecting that when we push the target onto the stack
> for a retpoline call, that affects the %esp-based offsets of everything
> that's been spilled, of course... and perhaps we're not correctly
> adjusting those offsets *back* again by accounting for the fact that
> calling the thunk actually moves the stack pointer back up again?
>
> Breakpoint 1, mp_register_ioapic (id=0, address=4273995776, gsi_base=0,
> cfg=0xc1b2fe88 <init_thread_union+7816>) at arch/x86/kernel/apic/io_apic_
> b.c:389
> 389             bool hotplug = !!ioapic_initialized;
> 1: x/i $pc
> => 0xc10469c7 <mp_register_ioapic+23>:  mov    0xc1d36170,%eax
> 2: gsi_base = 0
> (gdb) ni
> 393             pr_warn("%s, %d %x %x %px\n", __func__, id, address,
> gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469cc <mp_register_ioapic+28>:  mov    %eax,0x18(%esp)
> 2: gsi_base = 0
> (gdb)
> 0xc10469d0      393             pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469d0 <mp_register_ioapic+32>:  pushl  0x8(%ebp)
> 2: gsi_base = 0
> (gdb)
> 0xc10469d3      393             pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469d3 <mp_register_ioapic+35>:  push   %ecx
> 2: gsi_base = 0
> (gdb)
> 0xc10469d4      393             pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469d4 <mp_register_ioapic+36>:  push   %edx
> 2: gsi_base = 0
> (gdb)
> 0xc10469d5      393             pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469d5 <mp_register_ioapic+37>:  push   %ebx
> 2: gsi_base = 0
> (gdb)
> 0xc10469d6      393             pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469d6 <mp_register_ioapic+38>:  push   $0xc1a3f457
> 2: gsi_base = 0
> (gdb)
> 0xc10469db      393             pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469db <mp_register_ioapic+43>:  push   $0xc1a3f443
> 2: gsi_base = 0
> (gdb)
> 0xc10469e0      393             pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469e0 <mp_register_ioapic+48>:  call   0xc10aa840 <printk>
> 2: gsi_base = 0
> (gdb)
> 0xc10469e5      393             pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469e5 <mp_register_ioapic+53>:  add    $0x18,%esp
> 2: gsi_base = 0
> (gdb)
> 395             if (!address) {
> 1: x/i $pc
> => 0xc10469e8 <mp_register_ioapic+56>:  test   %esi,%esi
> 2: gsi_base = 0
> (gdb)
> 0xc10469ea      395             if (!address) {
> 1: x/i $pc
> => 0xc10469ea <mp_register_ioapic+58>:  je     0xc1046a34
> <mp_register_ioapic+132>
> 2: gsi_base = 0
> (gdb)
> 399             for_each_ioapic(ioapic)
> 1: x/i $pc
> => 0xc10469ec <mp_register_ioapic+60>:  mov    0xc1d36144,%eax
> 2: gsi_base = 0
> (gdb)
> 0xc10469f1      399             for_each_ioapic(ioapic)
> 1: x/i $pc
> => 0xc10469f1 <mp_register_ioapic+65>:  test   %eax,%eax
> 2: gsi_base = 0
> (gdb)
> 0xc10469f3      399             for_each_ioapic(ioapic)
> 1: x/i $pc
> => 0xc10469f3 <mp_register_ioapic+67>:  jle    0xc1046a10
> <mp_register_ioapic+96>
> 2: gsi_base = 0
> (gdb)
> 406             idx = find_free_ioapic_entry();
> 1: x/i $pc
> => 0xc1046a10 <mp_register_ioapic+96>:  call   0xc1045670
> <find_free_ioapic_entry>
> 2: gsi_base = 0
> (gdb)
> 407             if (idx >= MAX_IO_APICS) {
> 1: x/i $pc
> => 0xc1046a15 <mp_register_ioapic+101>: cmp    $0x40,%eax
> 2: gsi_base = 0
> (gdb)
> 0xc1046a18      407             if (idx >= MAX_IO_APICS) {
> 1: x/i $pc
> => 0xc1046a18 <mp_register_ioapic+104>: jl     0xc1046a4b
> <mp_register_ioapic+155>
> 2: gsi_base = 0
> (gdb)
> 0xc1046a4b      396                     pr_warn("Bogus (zero) I/O APIC
> address found, skipping!\n");
> 1: x/i $pc
> => 0xc1046a4b <mp_register_ioapic+155>: mov    %ebx,0x4(%esp)
> 2: gsi_base = 0
> (gdb)
> 412             pr_warn("At line %d, gsi_base is %d\n", __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a4f <mp_register_ioapic+159>: push   %edi
> 2: gsi_base = 0
> (gdb)
> 0xc1046a50      412             pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a50 <mp_register_ioapic+160>: push   $0x19c
> 2: gsi_base = 0
> (gdb)
> 0xc1046a55      412             pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a55 <mp_register_ioapic+165>: push   $0xc1a3f4fd
> 2: gsi_base = 0
> (gdb)
> 0xc1046a5a      412             pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a5a <mp_register_ioapic+170>: mov    %eax,%ebx
> 2: gsi_base = 0
> (gdb)
> 0xc1046a5c      412             pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a5c <mp_register_ioapic+172>: call   0xc10aa840 <printk>
> 2: gsi_base = 0
> (gdb)
> 0xc1046a61      412             pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a61 <mp_register_ioapic+177>: add    $0xc,%esp
> 2: gsi_base = 0
> (gdb)
> 0xc1046a64      412             pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a64 <mp_register_ioapic+180>: mov    %edi,0xc(%esp)
> 2: gsi_base = 0
> (gdb) p/x 0xc+$esp
> $1 = 0xc1b2fe34
> (gdb) disp *(int *)0xc1b2fe34
> 3: *(int *)0xc1b2fe34 = 120
> (gdb) ni
> 414             ioapics[idx].mp_config.type = MP_IOAPIC;
> 1: x/i $pc
> => 0xc1046a68 <mp_register_ioapic+184>: imul   $0x2c,%ebx,%edi
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046a6b      414             ioapics[idx].mp_config.type = MP_IOAPIC;
> 1: x/i $pc
> => 0xc1046a6b <mp_register_ioapic+187>: movb   $0x2,-0x3e2cb1bc(%edi)
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 415             ioapics[idx].mp_config.flags = MPC_APIC_USABLE;
> 1: x/i $pc
> => 0xc1046a72 <mp_register_ioapic+194>: movb   $0x1,-0x3e2cb1b9(%edi)
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 416             ioapics[idx].mp_config.apicaddr = address;
> 1: x/i $pc
> => 0xc1046a79 <mp_register_ioapic+201>: mov    %esi,-0x3e2cb1b8(%edi)
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 418             set_fixmap_nocache(FIX_IO_APIC_BASE_0 + idx, address);
> 1: x/i $pc
> => 0xc1046a7f <mp_register_ioapic+207>: lea    0x5(%ebx),%eax
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> cachemode2protval (pcm=_PAGE_CACHE_MODE_UC) at ./arch/x86/include/asm/
> pgtable_types.h:439
> 439             return __cachemode2pte_tbl[pcm];
> 1: x/i $pc
> => 0xc1046a82 <mp_register_ioapic+210>: movzwl 0xc1b424c6,%ecx
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> mp_register_ioapic (id=<optimized out>, address=4273995776, gsi_base=0,
> cfg=0xc1b2fe88 <init_thread_union+7816>) at arch/x86/kernel/apic/io_apic_
> b.c:418
> 418             set_fixmap_nocache(FIX_IO_APIC_BASE_0 + idx, address);
> 1: x/i $pc
> => 0xc1046a89 <mp_register_ioapic+217>: or     $0x163,%ecx
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046a8f      418             set_fixmap_nocache(FIX_IO_APIC_BASE_0 +
> idx, address);
> 1: x/i $pc
> => 0xc1046a8f <mp_register_ioapic+223>: mov    %eax,0x14(%esp)
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> __set_fixmap (idx=5, phys=<optimized out>, flags=...) at
> ./arch/x86/include/asm/paravirt.h:660
> 660             pv_mmu_ops.set_fixmap(idx, phys, flags);
> 1: x/i $pc
> => 0xc1046a93 <mp_register_ioapic+227>: mov    %esi,%edx
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046a95      660             pv_mmu_ops.set_fixmap(idx, phys, flags);
> 1: x/i $pc
> => 0xc1046a95 <mp_register_ioapic+229>: pushl  0xc1ad6434
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046a9b      660             pv_mmu_ops.set_fixmap(idx, phys, flags);
> 1: x/i $pc
> => 0xc1046a9b <mp_register_ioapic+235>: call   0xc18ff3a0
> <__x86_indirect_thunk>
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> mp_register_ioapic (id=<optimized out>, address=4273995776,
> gsi_base=<optimized out>, cfg=0xc1b2fe88 <init_thread_union+7816>) at
> arch/x86/kernel/apic/io_apic_b.c:419
> 419             if (bad_ioapic_register(idx)) {
> 1: x/i $pc
> => 0xc1046aa0 <mp_register_ioapic+240>: mov    %ebx,%eax
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046aa2      419             if (bad_ioapic_register(idx)) {
> 1: x/i $pc
> => 0xc1046aa2 <mp_register_ioapic+242>: call   0xc10455f0
> <bad_ioapic_register>
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046aa7      419             if (bad_ioapic_register(idx)) {
> 1: x/i $pc
> => 0xc1046aa7 <mp_register_ioapic+247>: test   %eax,%eax
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046aa9      419             if (bad_ioapic_register(idx)) {
> 1: x/i $pc
> => 0xc1046aa9 <mp_register_ioapic+249>: je     0xc1046ae1
> <mp_register_ioapic+305>
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046ae1      482     }
> 1: x/i $pc
> => 0xc1046ae1 <mp_register_ioapic+305>: mov    %ebx,0xc(%esp)
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb) p/x $esp+0xc
> $2 = 0xc1b2fe34
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180209/86e11261/attachment-0001.html>


More information about the llvm-dev mailing list