[llvm-dev] retpoline mitigation and 6.0
Reid Kleckner via llvm-dev
llvm-dev at lists.llvm.org
Fri Feb 9 11:24:26 PST 2018
I haven't read the all the emails in full detail, but it seems pretty clear
that __x86_indirect_thunk and __llvm_retpoline_push do not do the same
things. It sounds like __llvm_retpoline_push is equivalent to
__x86_indirect_thunk except first it swaps the two words on the top of the
stack.
I arranged it this way because the x86 call instruction puts the intended
return address on the top of the stack, and there's no easy way to put it
anywhere else. We use this thunk when we want to make an indirect call and
there are no available scratch registers, i.e. 32-bit -mregparm=3 and the
call has three or more arguments, which happens in Linux. One way to avoid
this would be to compile with -mregparm=2, but that would pessimize direct
calls unnecessarily.
It sounds like Linux is already providing its own thunks, so it might be
better to for us to go back to the __llvm_retpoline_push external thunk
name and then get Linux to provide that thunk ifdef __clang__.
On Fri, Feb 9, 2018 at 7:37 AM, David Woodhouse via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On Fri, 2018-02-09 at 12:54 +0000, David Woodhouse wrote:
> > On Fri, 2018-02-09 at 10:36 +0000, David Woodhouse wrote:
> > >
> > > Did you get anywhere with the function attribute? Having isolated the
> > > next boot failure to "it goes away if I compile io_apic.c without
> > > retpoline", bisecting it per-function would help to further delay the
> > > bit where I actually have to start *thinking*...
> >
> > It's mp_register_ioapic(), and only when io_apic_unique_id() gets
> > inlined, at which point bad things start happening.
> >
> > [ 0.000000] mp_register_ioapic, 0 fec00000 0 c1b2fe88
> > [ 0.000000] At line 412, gsi_base is 0
> > [ 0.000000] At line 425, gsi_base is -1043715332
> > [ 0.000000] At line 427, gsi_base is -1043715332
> >
> > http://git.infradead.org/users/dwmw2/linux-retpoline.git/shortlog/ref
> > s/heads/clang
> > http://david.woodhou.se/clang32.config
> >
> >
> > http://david.woodhou.se/io_apic_b.i
> > http://david.woodhou.se/io_apic_b.noretpoline.s
> > http://david.woodhou.se/io_apic_b.retpoline.s
> >
> > I don't *think* I screwed up copying and pasting the retpoline thunk.
>
> So, looking at the retpoline version...
>
> gsi_base is in %edi, and gets spilled to the stack at about .Ltmp22
> which is at line 412 right after the printk call:
>
> .Ltmp22:
> addl $12, %esp
> movl %edi, 12(%esp) # 4-byte Spill
>
> At .Ltmp28 we then call __x86_indirect_thunk which *does* look like
> it's doing the right thing (and using the LLVM-emitted thunk instead of
> my own behaves the same; I don't think it's my copy-paste at fault).
>
> At .Ltmp29 we call bad_ioapic_register() and then when returns zero (it
> does) we je to .LBB0_10 aka .Ltmp34. At which point we happily stomp on
> the value of 12(%esp) by spilling something *else* over it:
>
> .LBB0_10: # %if.end28
> .Ltmp34:
> #DEBUG_VALUE: mp_register_ioapic:address <- $esi
> #DEBUG_VALUE: mp_register_ioapic:cfg <- [DW_OP_plus_uconst 8]
> [$ebp+0]
> .loc 2 0 1 is_stmt 0 # arch/x86/kernel/apic/io_apic_
> b.c:0:1
> movl %ebx, 12(%esp) # 4-byte Spill
> movl %edi, 20(%esp) # 4-byte Spill
>
> Compare with the noretpoline.s version where those offsets from %esp
> are different:
>
> .LBB0_10: # %if.end28
> .Ltmp33:
> #DEBUG_VALUE: mp_register_ioapic:gsi_base <- [DW_OP_plus_uconst
> 12] [$esp+0]
> #DEBUG_VALUE: mp_register_ioapic:address <- $esi
> #DEBUG_VALUE: mp_register_ioapic:cfg <- [DW_OP_plus_uconst 8]
> [$ebp+0]
> .loc 2 0 1 is_stmt 0 # arch/x86/kernel/apic/io_apic_
> b.c:0:1
> movl %ebx, 8(%esp) # 4-byte Spill
> movl %edi, 16(%esp) # 4-byte Spill
>
>
> So... obviously the main preceding difference here is that the
> retpoline version did this for an indirect call:
>
> movl %esi, %edx
> pushl pv_mmu_ops+132
> .Ltmp28:
> calll __x86_indirect_thunk
>
> ... while the noretpoline version did this:
>
> movl %esi, %edx
> calll *pv_mmu_ops+132
> .Ltmp28:
>
> Which leaves me suspecting that when we push the target onto the stack
> for a retpoline call, that affects the %esp-based offsets of everything
> that's been spilled, of course... and perhaps we're not correctly
> adjusting those offsets *back* again by accounting for the fact that
> calling the thunk actually moves the stack pointer back up again?
>
> Breakpoint 1, mp_register_ioapic (id=0, address=4273995776, gsi_base=0,
> cfg=0xc1b2fe88 <init_thread_union+7816>) at arch/x86/kernel/apic/io_apic_
> b.c:389
> 389 bool hotplug = !!ioapic_initialized;
> 1: x/i $pc
> => 0xc10469c7 <mp_register_ioapic+23>: mov 0xc1d36170,%eax
> 2: gsi_base = 0
> (gdb) ni
> 393 pr_warn("%s, %d %x %x %px\n", __func__, id, address,
> gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469cc <mp_register_ioapic+28>: mov %eax,0x18(%esp)
> 2: gsi_base = 0
> (gdb)
> 0xc10469d0 393 pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469d0 <mp_register_ioapic+32>: pushl 0x8(%ebp)
> 2: gsi_base = 0
> (gdb)
> 0xc10469d3 393 pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469d3 <mp_register_ioapic+35>: push %ecx
> 2: gsi_base = 0
> (gdb)
> 0xc10469d4 393 pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469d4 <mp_register_ioapic+36>: push %edx
> 2: gsi_base = 0
> (gdb)
> 0xc10469d5 393 pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469d5 <mp_register_ioapic+37>: push %ebx
> 2: gsi_base = 0
> (gdb)
> 0xc10469d6 393 pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469d6 <mp_register_ioapic+38>: push $0xc1a3f457
> 2: gsi_base = 0
> (gdb)
> 0xc10469db 393 pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469db <mp_register_ioapic+43>: push $0xc1a3f443
> 2: gsi_base = 0
> (gdb)
> 0xc10469e0 393 pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469e0 <mp_register_ioapic+48>: call 0xc10aa840 <printk>
> 2: gsi_base = 0
> (gdb)
> 0xc10469e5 393 pr_warn("%s, %d %x %x %px\n", __func__,
> id, address, gsi_base, cfg);
> 1: x/i $pc
> => 0xc10469e5 <mp_register_ioapic+53>: add $0x18,%esp
> 2: gsi_base = 0
> (gdb)
> 395 if (!address) {
> 1: x/i $pc
> => 0xc10469e8 <mp_register_ioapic+56>: test %esi,%esi
> 2: gsi_base = 0
> (gdb)
> 0xc10469ea 395 if (!address) {
> 1: x/i $pc
> => 0xc10469ea <mp_register_ioapic+58>: je 0xc1046a34
> <mp_register_ioapic+132>
> 2: gsi_base = 0
> (gdb)
> 399 for_each_ioapic(ioapic)
> 1: x/i $pc
> => 0xc10469ec <mp_register_ioapic+60>: mov 0xc1d36144,%eax
> 2: gsi_base = 0
> (gdb)
> 0xc10469f1 399 for_each_ioapic(ioapic)
> 1: x/i $pc
> => 0xc10469f1 <mp_register_ioapic+65>: test %eax,%eax
> 2: gsi_base = 0
> (gdb)
> 0xc10469f3 399 for_each_ioapic(ioapic)
> 1: x/i $pc
> => 0xc10469f3 <mp_register_ioapic+67>: jle 0xc1046a10
> <mp_register_ioapic+96>
> 2: gsi_base = 0
> (gdb)
> 406 idx = find_free_ioapic_entry();
> 1: x/i $pc
> => 0xc1046a10 <mp_register_ioapic+96>: call 0xc1045670
> <find_free_ioapic_entry>
> 2: gsi_base = 0
> (gdb)
> 407 if (idx >= MAX_IO_APICS) {
> 1: x/i $pc
> => 0xc1046a15 <mp_register_ioapic+101>: cmp $0x40,%eax
> 2: gsi_base = 0
> (gdb)
> 0xc1046a18 407 if (idx >= MAX_IO_APICS) {
> 1: x/i $pc
> => 0xc1046a18 <mp_register_ioapic+104>: jl 0xc1046a4b
> <mp_register_ioapic+155>
> 2: gsi_base = 0
> (gdb)
> 0xc1046a4b 396 pr_warn("Bogus (zero) I/O APIC
> address found, skipping!\n");
> 1: x/i $pc
> => 0xc1046a4b <mp_register_ioapic+155>: mov %ebx,0x4(%esp)
> 2: gsi_base = 0
> (gdb)
> 412 pr_warn("At line %d, gsi_base is %d\n", __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a4f <mp_register_ioapic+159>: push %edi
> 2: gsi_base = 0
> (gdb)
> 0xc1046a50 412 pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a50 <mp_register_ioapic+160>: push $0x19c
> 2: gsi_base = 0
> (gdb)
> 0xc1046a55 412 pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a55 <mp_register_ioapic+165>: push $0xc1a3f4fd
> 2: gsi_base = 0
> (gdb)
> 0xc1046a5a 412 pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a5a <mp_register_ioapic+170>: mov %eax,%ebx
> 2: gsi_base = 0
> (gdb)
> 0xc1046a5c 412 pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a5c <mp_register_ioapic+172>: call 0xc10aa840 <printk>
> 2: gsi_base = 0
> (gdb)
> 0xc1046a61 412 pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a61 <mp_register_ioapic+177>: add $0xc,%esp
> 2: gsi_base = 0
> (gdb)
> 0xc1046a64 412 pr_warn("At line %d, gsi_base is %d\n",
> __LINE__,gsi_base);
> 1: x/i $pc
> => 0xc1046a64 <mp_register_ioapic+180>: mov %edi,0xc(%esp)
> 2: gsi_base = 0
> (gdb) p/x 0xc+$esp
> $1 = 0xc1b2fe34
> (gdb) disp *(int *)0xc1b2fe34
> 3: *(int *)0xc1b2fe34 = 120
> (gdb) ni
> 414 ioapics[idx].mp_config.type = MP_IOAPIC;
> 1: x/i $pc
> => 0xc1046a68 <mp_register_ioapic+184>: imul $0x2c,%ebx,%edi
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046a6b 414 ioapics[idx].mp_config.type = MP_IOAPIC;
> 1: x/i $pc
> => 0xc1046a6b <mp_register_ioapic+187>: movb $0x2,-0x3e2cb1bc(%edi)
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 415 ioapics[idx].mp_config.flags = MPC_APIC_USABLE;
> 1: x/i $pc
> => 0xc1046a72 <mp_register_ioapic+194>: movb $0x1,-0x3e2cb1b9(%edi)
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 416 ioapics[idx].mp_config.apicaddr = address;
> 1: x/i $pc
> => 0xc1046a79 <mp_register_ioapic+201>: mov %esi,-0x3e2cb1b8(%edi)
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 418 set_fixmap_nocache(FIX_IO_APIC_BASE_0 + idx, address);
> 1: x/i $pc
> => 0xc1046a7f <mp_register_ioapic+207>: lea 0x5(%ebx),%eax
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> cachemode2protval (pcm=_PAGE_CACHE_MODE_UC) at ./arch/x86/include/asm/
> pgtable_types.h:439
> 439 return __cachemode2pte_tbl[pcm];
> 1: x/i $pc
> => 0xc1046a82 <mp_register_ioapic+210>: movzwl 0xc1b424c6,%ecx
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> mp_register_ioapic (id=<optimized out>, address=4273995776, gsi_base=0,
> cfg=0xc1b2fe88 <init_thread_union+7816>) at arch/x86/kernel/apic/io_apic_
> b.c:418
> 418 set_fixmap_nocache(FIX_IO_APIC_BASE_0 + idx, address);
> 1: x/i $pc
> => 0xc1046a89 <mp_register_ioapic+217>: or $0x163,%ecx
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046a8f 418 set_fixmap_nocache(FIX_IO_APIC_BASE_0 +
> idx, address);
> 1: x/i $pc
> => 0xc1046a8f <mp_register_ioapic+223>: mov %eax,0x14(%esp)
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> __set_fixmap (idx=5, phys=<optimized out>, flags=...) at
> ./arch/x86/include/asm/paravirt.h:660
> 660 pv_mmu_ops.set_fixmap(idx, phys, flags);
> 1: x/i $pc
> => 0xc1046a93 <mp_register_ioapic+227>: mov %esi,%edx
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046a95 660 pv_mmu_ops.set_fixmap(idx, phys, flags);
> 1: x/i $pc
> => 0xc1046a95 <mp_register_ioapic+229>: pushl 0xc1ad6434
> 2: gsi_base = 0
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046a9b 660 pv_mmu_ops.set_fixmap(idx, phys, flags);
> 1: x/i $pc
> => 0xc1046a9b <mp_register_ioapic+235>: call 0xc18ff3a0
> <__x86_indirect_thunk>
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> mp_register_ioapic (id=<optimized out>, address=4273995776,
> gsi_base=<optimized out>, cfg=0xc1b2fe88 <init_thread_union+7816>) at
> arch/x86/kernel/apic/io_apic_b.c:419
> 419 if (bad_ioapic_register(idx)) {
> 1: x/i $pc
> => 0xc1046aa0 <mp_register_ioapic+240>: mov %ebx,%eax
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046aa2 419 if (bad_ioapic_register(idx)) {
> 1: x/i $pc
> => 0xc1046aa2 <mp_register_ioapic+242>: call 0xc10455f0
> <bad_ioapic_register>
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046aa7 419 if (bad_ioapic_register(idx)) {
> 1: x/i $pc
> => 0xc1046aa7 <mp_register_ioapic+247>: test %eax,%eax
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046aa9 419 if (bad_ioapic_register(idx)) {
> 1: x/i $pc
> => 0xc1046aa9 <mp_register_ioapic+249>: je 0xc1046ae1
> <mp_register_ioapic+305>
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb)
> 0xc1046ae1 482 }
> 1: x/i $pc
> => 0xc1046ae1 <mp_register_ioapic+305>: mov %ebx,0xc(%esp)
> 2: gsi_base = <optimized out>
> 3: *(int *)0xc1b2fe34 = 0
> (gdb) p/x $esp+0xc
> $2 = 0xc1b2fe34
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180209/86e11261/attachment-0001.html>
More information about the llvm-dev
mailing list