[LLVMdev] Tight overlapping loops and performance
Jonathan Turner
probata at hotmail.com
Mon Mar 2 16:58:16 PST 2009
> You're misreading the asm... nothing is touching memory. (BTW, "leal
> -1(%eax), %eax" isn't a memory operation; it's just subtracting one
> from %eax.) You might want to try reading the LLVM IR (which you can
> generate with llvm-gcc -S -emit-llvm); it tends to be easier to read.
I tried that, but I'm still learning LLVM. Seeing indvar, phi nodes, tail
calls on printfs, and nounwinds had me more confused than the asm.
> A taken and non-taken branch have roughly the same cost on any
> remotely recent x86 processor.
I was wondering if that might be the case.
The crux of the example still seems intact. From LLVM SVN, converted to asm via llc:
.text
.align 4,0x90
.globl _main
_main:
subl $12, %esp
movl $1999, %eax
xorl %ecx, %ecx
movl $1999, %edx
.align 4,0x90
LBB1_1: ## loopto
cmpl $1, %eax
leal -1(%eax), %eax
cmove %edx, %eax
incl %ecx
cmpl $999999999, %ecx
jne LBB1_1 ## loopto
LBB1_2: ## bb1
movl %eax, 4(%esp)
movl $LC, (%esp)
call _printf
xorl %eax, %eax
addl $12, %esp
ret
.section __TEXT,__cstring,cstring_literals
LC: ## LC
.asciz "Timeout: %i\n"
.subsections_via_symbols
Rewriting both loops to count down with decl and branch on the zero flag, instead of using cmove/incl, might seem like more work, but it appears to be faster:
.text
.align 4,0x90
.globl _main
_main:
subl $12, %esp
movl $2000, %eax
movl $1000000000, %ecx
.align 4,0x90
LBB1_3:
movl $2000, %eax
LBB1_1: ## loopto
decl %eax
jz LBB1_3
decl %ecx
jnz LBB1_1 ## loopto
LBB1_2: ## bb1
movl %eax, 4(%esp)
movl $LC, (%esp)
call _printf
xorl %eax, %eax
addl $12, %esp
ret
.section __TEXT,__cstring,cstring_literals
LC: ## LC
.asciz "Timeout: %i\n"
.subsections_via_symbols
The first example runs in 1.7s, the second in 1.0s. That's on my dual-core OS X box. A 2-processor quad-core Xeon box running Linux gives very similar results.
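For readers who find the asm hard to follow, here is a rough C model of what the two listings compute. It is only a sketch: the original C source is not shown in this message, and the function names countup_select and countdown_branch are hypothetical, chosen for illustration. Both functions assume at least one iteration, since the asm loops test their exit condition at the bottom.

```c
/* Hypothetical C model of the two asm listings (names are illustrative). */

/* First listing: the counter counts up, and the timeout is reset
 * with a conditional select (cmpl $1 / leal -1 / cmove). */
unsigned countup_select(unsigned iterations)
{
    unsigned timeout = 1999;                      /* %eax */
    for (unsigned i = 0; i != iterations; ++i) {  /* %ecx, incl + cmpl */
        /* If the old value was 1, reload 1999; otherwise subtract one. */
        timeout = (timeout == 1) ? 1999 : timeout - 1;
    }
    return timeout;
}

/* Second listing: both values count down and branch on the zero flag.
 * A timeout reset (jz LBB1_3) does not consume an outer iteration. */
unsigned countdown_branch(unsigned iterations)
{
    unsigned timeout = 2000;                      /* %eax */
    unsigned remaining = iterations;              /* %ecx */
    for (;;) {
        if (--timeout == 0) {                     /* decl %eax ; jz LBB1_3 */
            timeout = 2000;                       /* movl $2000, %eax */
            continue;
        }
        if (--remaining == 0)                     /* decl %ecx ; jnz LBB1_1 */
            break;
    }
    return timeout;
}
```

With the constants from the listings, countup_select(999999999) and countdown_branch(1000000000) should both return 1750 (999999999 mod 1999 is 249, and 1999 - 249 is 1750), so the two versions print the same "Timeout" line and differ only in how the loop is encoded. A compiler is free to turn either function into different machine code, so this model describes the arithmetic, not the timings.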
Jonathan