[LLVMdev] Tight overlapping loops and performance

Mon Mar 2 11:38:22 PST 2009

I was playing around in x86 assembly the other day, looking at ways to optimize my cooperative multitasking system.  Currently, it uses a 'timeout' counter that is decremented on every pass through a loop, so I can leave the loop and switch to the next cooperative thread if the loop runs too long.
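To make the idea concrete, here is a minimal sketch of a timeout counter driving a cooperative switch. It is not the code from my actual system: `task_t`, `run_slice`, `run_all` and the round-robin policy are hypothetical names and choices made up for illustration. As in the test code in this post, the counter is checked before the work is done, so a budget of N allows N-1 units of work per slice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task: just counts down `remaining` units of work. */
typedef struct {
    long remaining;   /* work units left */
    long slices;      /* how many times the task has been scheduled */
} task_t;

/* Run one task until it finishes or its budget runs out.
   Returns 1 if the task has no work left, 0 if it timed out. */
static int run_slice(task_t *t, int budget)
{
    assert(budget > 1);        /* a budget of 1 would never do any work */
    t->slices++;
    while (t->remaining > 0) {
        if (--budget == 0)
            return 0;          /* timeout: give up the CPU */
        t->remaining--;        /* one unit of work */
    }
    return 1;
}

/* Round-robin over the tasks until all are finished.
   Returns the total number of slices handed out. */
static long run_all(task_t *tasks, size_t n, int budget)
{
    long total = 0;
    size_t done = 0;
    while (done < n) {
        done = 0;
        for (size_t i = 0; i < n; i++) {
            if (tasks[i].remaining == 0) {
                done++;        /* already finished, skip it */
                continue;
            }
            total++;
            if (run_slice(&tasks[i], budget))
                done++;
        }
    }
    return total;
}
```

With two tasks of 5 and 3 work units and a budget of 3 (two units of work per slice), `run_all` hands out 5 slices in total: 3 to the first task and 2 to the second.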

The asm has two overlapping loops:

---
_main:
        mov ecx, 1000000000
timeoutloop:
        mov edx, 2000
loopto:
        dec edx
        jz timeoutloop
        dec ecx
        jnz loopto
        mov eax, 0
        ret
---

This takes 1.0s on my machine.

To compare, I wanted to see what LLVM performance was like and if a similar technique would yield good performance.  I cooked up this test in C:

---
#include <stdio.h>

int main() {
  int loop = 1000000000;
  int timeout;

timeoutloop:
  timeout = 2000;
loopto:
  if (--timeout == 0) goto timeoutloop;
  if (--loop != 0) goto loopto;

  printf("Timeout: %i\n", timeout);

  return 0;
}
---
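For anyone who finds the gotos hard to follow, the same control flow can be written as a single loop. This is only a sketch for reading and for checking the logic on small inputs, not the code I timed: `overlapping_loops`, its parameters and the `iterations` out-parameter are hypothetical and not part of the test above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper with the same control flow as the goto version.
   `loop` plays the role of the outer counter, `period` the timeout reload
   value. If `iterations` is non-NULL it receives the number of passes
   through the loop body. Returns the final value of the timeout counter. */
static int overlapping_loops(int loop, int period, long *iterations)
{
    assert(loop > 0 && period > 1);
    int timeout = period;
    long passes = 0;
    for (;;) {
        passes++;
        if (--timeout == 0) {   /* timeout expired: reload, skip the work */
            timeout = period;
            continue;
        }
        if (--loop == 0)        /* outer counter expired: finished */
            break;
    }
    if (iterations != NULL)
        *iterations = passes;
    return timeout;
}
```

For example, `overlapping_loops(10, 4, &n)` returns 3 and sets `n` to 13: every fourth pass only reloads the timeout, so 10 decrements of `loop` need 13 passes in total.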

With gcc 4.2 and 4.4 at -O3, the C version matches the 1.0s of the assembly.  The LLVM build, after running it through opt -std-compile-opts, takes around 1.7s.

Should I be looking at any particular optimization passes that aren't in -std-compile-opts to match the gcc speeds?

Thanks,

Jonathan
