[cfe-dev] variable length argument functions in AMD64 arch

Zhi Wang beiyuw at gmail.com
Thu Aug 13 13:26:50 PDT 2009


Hi, all

I am trying to use clang to compile a small OS kernel, and variadic
(variable-length argument) functions are giving me a lot of trouble.
For example, a printk function is defined in my kernel as:

int printk (const char *format, ...)
{
        return 0;
}

This printk is compiled into the following (objdump output, AT&T syntax):
4011e2b0 <printk>:
4011e2b0:       55                      push   %ebp
4011e2b1:       48                      dec    %eax
4011e2b2:       89 e5                   mov    %esp,%ebp
4011e2b4:       48                      dec    %eax
4011e2b5:       81 ec b0 00 00 00       sub    $0xb0,%esp
4011e2bb:       0f 29 7d f0             movaps %xmm7,-0x10(%ebp)
4011e2bf:       0f 29 75 e0             movaps %xmm6,-0x20(%ebp)
4011e2c3:       0f 29 6d d0             movaps %xmm5,-0x30(%ebp)
4011e2c7:       0f 29 65 c0             movaps %xmm4,-0x40(%ebp)
4011e2cb:       0f 29 5d b0             movaps %xmm3,-0x50(%ebp)
4011e2cf:       0f 29 55 a0             movaps %xmm2,-0x60(%ebp)
4011e2d3:       0f 29 4d 90             movaps %xmm1,-0x70(%ebp)
4011e2d7:       0f 29 45 80             movaps %xmm0,-0x80(%ebp)
4011e2db:       4c                      dec    %esp
4011e2dc:       89 8d 78 ff ff ff       mov    %ecx,-0x88(%ebp)
4011e2e2:       4c                      dec    %esp
4011e2e3:       89 85 70 ff ff ff       mov    %eax,-0x90(%ebp)
4011e2e9:       48                      dec    %eax
4011e2ea:       89 8d 68 ff ff ff       mov    %ecx,-0x98(%ebp)
4011e2f0:       48                      dec    %eax
4011e2f1:       89 95 60 ff ff ff       mov    %edx,-0xa0(%ebp)
4011e2f7:       48                      dec    %eax
4011e2f8:       89 b5 58 ff ff ff       mov    %esi,-0xa8(%ebp)
4011e2fe:       31 c0                   xor    %eax,%eax
4011e300:       48                      dec    %eax
4011e301:       81 c4 b0 00 00 00       add    $0xb0,%esp
4011e307:       5d                      pop    %ebp
4011e308:       c3                      ret

It seems clang generates code to handle variable-length arguments
whether or not the va_xxx macros (va_start, va_end) are used. (gcc only
generates this code when va_start is used.)
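
For contrast with the empty printk above, here is a minimal sketch of a
variadic function that really does call va_start, which is the case where
the compiler needs to spill the argument registers. The name sum_ints and
its count parameter are hypothetical, chosen only for illustration; they
are not part of the kernel.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example (not from the kernel): adds up `count` int
 * arguments passed after the fixed parameter. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    assert(count >= 0);       /* caller must pass a non-negative count */
    va_start(ap, count);      /* from here on the saved arguments are read */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) walks the three trailing arguments;
printk above never touches them, so the register spill in its prologue
does no useful work.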

My biggest issue with this code is that movaps is used. According to
Intel's manual, if the destination memory operand isn't 16-byte aligned,
a #GP (general protection fault) occurs. Using movaps therefore seems
wrong unless we can guarantee that ebp is always 16-byte aligned, which
may not be true. I manually edited the generated binary to use movups
instead (the same instruction as movaps, except that it does not check
alignment), and everything works fine.
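
The alignment condition can be sketched in C. The save slots in the
listing sit at multiples of 16 below the frame pointer (-0x10, -0x20, ...),
so each slot is 16-byte aligned exactly when the frame pointer itself is.
The helper names below are hypothetical. The sketch only models the
address arithmetic and assumes nothing about how the kernel sets up its
stacks. Code that enters a function with a stack that is not aligned as
the System V AMD64 ABI expects would end up in the failing case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if addr is a multiple of 16, which is
 * the condition movaps requires of its memory operand. */
int is_aligned16(uintptr_t addr)
{
    return (addr & 0xF) == 0;
}

/* Hypothetical helper: checks the save slot `offset` bytes below the
 * frame pointer, mirroring -0x10(%ebp), -0x20(%ebp), ... above. */
int save_slot_is_aligned(uintptr_t frame_pointer, uintptr_t offset)
{
    assert(offset % 16 == 0); /* slots in the listing are 16 bytes apart */
    return is_aligned16(frame_pointer - offset);
}
```

With a frame pointer of 0x7ff0 every slot passes the check, while a frame
pointer of 0x7ff8 makes every slot fail it, which is the situation in
which movaps would fault and movups would not.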


Any comments?

--Zhi


