[LLVMbugs] [Bug 3707] New: Inefficient loop codegen

bugzilla-daemon at cs.uiuc.edu
Tue Mar 3 04:57:08 PST 2009


http://llvm.org/bugs/show_bug.cgi?id=3707

           Summary: Inefficient loop codegen
           Product: libraries
           Version: trunk
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Backend: X86
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: jturner at minnow-lang.org
                CC: llvmbugs at cs.uiuc.edu


Compiling this with the llvm-gcc toolchain (through opt and llc):

#include <stdio.h>

int main() {
  int loop = 1000000000;
  int timeout;

timeoutloop:
  timeout = 2000;
  /* asm("nop;"); */
loopto:
  if (--timeout == 0) goto timeoutloop;
  if (--loop != 0) goto loopto;

  printf("Timeout: %i\n", timeout);
  return 0;
}

Yields this asm as output (I'm using OS X):

        .text
        .align  4,0x90
        .globl  _main
_main:
        subl    $12, %esp
        movl    $1999, %eax
        xorl    %ecx, %ecx
        movl    $1999, %edx
        .align  4,0x90
LBB1_1: ## loopto
        cmpl    $1, %eax
        leal    -1(%eax), %eax
        cmove   %edx, %eax
        incl    %ecx
        cmpl    $999999999, %ecx
        jne     LBB1_1  ## loopto
LBB1_2: ## bb1
        movl    %eax, 4(%esp)
        movl    $LC, (%esp)
        call    _printf
        xorl    %eax, %eax
        addl    $12, %esp
        ret
        .section __TEXT,__cstring,cstring_literals
LC:                             ## LC
        .asciz  "Timeout: %i\n"

        .subsections_via_symbols

Which runs in 1.7s on this machine.

Uncommenting the 'asm("nop;")' in the C code above instead yields this output:

        .text
        .align  4,0x90
        .globl  _main
_main:
        subl    $12, %esp
        movl    $1000000000, %eax
        .align  4,0x90
LBB1_1: ## loopto.thread
        movl    %eax, %ecx
        ## InlineAsm Start
        nop;
        ## InlineAsm End
        movl    $4294967295, %edx
        jmp     LBB1_3  ## bb
LBB1_2: ## loopto
        decl    %eax
        incl    %edx
        cmpl    $1998, %edx
        je      LBB1_1  ## loopto.thread
LBB1_3: ## bb
        cmpl    $1, %eax
        jne     LBB1_2  ## loopto
LBB1_4: ## bb1
        subl    %ecx, %eax
        addl    $1999, %eax
        movl    %eax, 4(%esp)
        movl    $LC, (%esp)
        call    _printf
        xorl    %eax, %eax
        addl    $12, %esp
        ret
        .section __TEXT,__cstring,cstring_literals
LC:                             ## LC
        .asciz  "Timeout: %i\n"

        .subsections_via_symbols


Which runs in 1.0s.

The simpler loop (without the inline asm) runs slower than the one with the
nop: 1.7s versus 1.0s.  Evan Cheng points out on the LLVM mailing list:

"The main issue is incl updates the EFLAGS condition code register. But  
llvm x86 isn't taking advantage of that. This is a known issue,  
hopefully someone will find the time to implement before 2.6.

The second issue is the leal -1 can be turned (back) into a decl.  
Combine that with the optimization previously described, it can  
eliminate the first cmpl."

Another possibility is that the cmove in this case is slower than a jz to a
block that resets %eax.  Modifying the first asm listing above to branch
instead of using cmove:

        .text
        .align  4,0x90
        .globl  _main
_main:
        subl    $12, %esp
        movl    $1999, %eax
        xorl    %ecx, %ecx
        movl    $1999, %edx
        jmp LBB1_1
        .align  4,0x90
LBB1_3:
        movl    %edx, %eax
        jmp     LBB1_4
LBB1_1: ## loopto
        cmpl    $1, %eax
        leal    -1(%eax), %eax
        jz      LBB1_3
LBB1_4:
        incl    %ecx
        cmpl    $999999999, %ecx
        jnz     LBB1_1  ## loopto
        jmp     LBB1_2
LBB1_2: ## bb1
        movl    %eax, 4(%esp)
        movl    $LC, (%esp)
        call    _printf
        xorl    %eax, %eax
        addl    $12, %esp
        ret
        .section __TEXT,__cstring,cstring_literals
LC:                             ## LC
        .asciz  "Timeout: %i\n"

        .subsections_via_symbols

Which also runs in 1.0s.
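
For reference, here is a rough C model of what the original loop, the cmove
listing and the hand-modified jz listing each compute.  It is only a sketch
of the semantics: the function names and the loop/reset parameters are made
up for illustration (the report hard-codes 1000000000 and 2000), and a C
compiler is free to generate the same code for both variants, so it says
nothing about timing.

```c
#include <assert.h>

/* The loop from the report, with the iteration count and the reset value
   turned into parameters (hypothetical names) so small cases run quickly.
   Assumes loop >= 1 and reset >= 2; outside that range the loop does not
   terminate as intended. */
int timeout_goto(int loop, int reset) {
  int timeout;
  assert(loop >= 1 && reset >= 2);
timeoutloop:
  timeout = reset;
loopto:
  if (--timeout == 0) goto timeoutloop;
  if (--loop != 0) goto loopto;
  return timeout;
}

/* Model of the first listing: the first iteration is peeled (timeout starts
   at reset - 1) and the remaining loop - 1 iterations pick the next value
   with a select, as cmpl/leal/cmove do. */
int timeout_select(int loop, int reset) {
  int timeout = reset - 1;                 /* movl $1999, %eax */
  int i;
  assert(loop >= 1 && reset >= 2);
  for (i = 0; i != loop - 1; ++i) {
    int is_one = (timeout == 1);           /* cmpl $1, %eax */
    int next = timeout - 1;                /* leal -1(%eax), %eax */
    timeout = is_one ? reset - 1 : next;   /* cmove %edx, %eax */
  }
  return timeout;
}

/* Model of the hand-modified listing: same arithmetic, but the reset is
   done on a separate path that is only taken when timeout was 1. */
int timeout_branch(int loop, int reset) {
  int timeout = reset - 1;                 /* movl $1999, %eax */
  int i;
  assert(loop >= 1 && reset >= 2);
  for (i = 0; i != loop - 1; ++i) {
    int is_one = (timeout == 1);           /* cmpl $1, %eax */
    timeout = timeout - 1;                 /* leal -1(%eax), %eax */
    if (is_one)                            /* jz LBB1_3 */
      timeout = reset - 1;                 /* movl %edx, %eax */
  }
  return timeout;
}
```

Calling any of the three with (1000000000, 2000) corresponds to the program
in this report; all three should return the same value for the same
arguments, since the reset path is taken only once every reset - 1
iterations.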

