[LLVMbugs] [Bug 13320] New: unnecessary use of lea to increment an induction variable

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Tue Jul 10 09:21:40 PDT 2012


http://llvm.org/bugs/show_bug.cgi?id=13320

             Bug #: 13320
           Summary: unnecessary use of lea to increment an induction
                    variable
           Product: libraries
           Version: trunk
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: rafael.espindola at gmail.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified


gcc-4.2 compiles

long IndexOfChild(long* children, long d, long count)  {
  long i = -1;
  for (; ;) {
    if (i >= (count -1))
      break;
    ++i;
    if (children[i] == d) {
      return i;
    }
  }
  return -1;
}


to

0000000000000000    decq    %rdx
0000000000000003    movq    $0xffffffff,%rax
000000000000000a    nopw    0x00(%rax,%rax)
0000000000000010    cmpq    %rdx,%rax
0000000000000013    jge    0x00000020
0000000000000015    incq    %rax
0000000000000018    cmpq    %rsi,(%rdi,%rax,8)
000000000000001c    jne    0x00000010
000000000000001e    repz/ret
0000000000000020    movq    $0xffffffff,%rax
0000000000000027    ret


Clang produces

0000000000000000    decq    %rdx
0000000000000003    movq    $0xffffffff,%rcx
000000000000000a    movq    %rcx,%rax
000000000000000d    nopl    (%rax)
0000000000000010    cmpq    %rdx,%rax
0000000000000013    jge    0x00000021
0000000000000015    cmpq    %rsi,0x08(%rdi,%rax,8)
000000000000001a    leaq    0x01(%rax),%rax
000000000000001e    jne    0x00000010
0000000000000020    ret
0000000000000021    movq    %rcx,%rax
0000000000000024    ret

The use of an extra register (rcx) is probably not a big issue, as it is not
used in the loop. The strange part is that clang swaps the order of the
increment of 'i' and the load, and then uses a leaq instead of an inc so that
the flags set by the compare are still intact for the jne.
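
As a rough illustration, clang's loop can be written back at the source level
as below. This is a sketch based on reading the disassembly above, not
something clang emits; the name IndexOfChildReordered and the locals 'limit'
and 'match' are made up for this example. The compare against children[i + 1]
comes first, the increment follows, and the branch still depends on the
earlier compare, which is why the increment must not touch the flags.

```c
/* Loop shape from the report: increment first, then load and compare. */
long IndexOfChild(long *children, long d, long count) {
  long i = -1;
  for (;;) {
    if (i >= (count - 1))
      break;
    ++i;                      /* incq %rax */
    if (children[i] == d)     /* cmpq %rsi,(%rdi,%rax,8) */
      return i;
  }
  return -1;
}

/* Hypothetical source-level rendering of the clang output:
 * compare against the next element, then increment, then branch. */
long IndexOfChildReordered(long *children, long d, long count) {
  long limit = count - 1;     /* decq %rdx */
  long i = -1;
  for (;;) {
    if (i >= limit)           /* cmpq %rdx,%rax ; jge */
      return -1;
    int match = (children[i + 1] == d);  /* cmpq %rsi,0x08(%rdi,%rax,8) */
    i = i + 1;                /* leaq 0x01(%rax),%rax leaves flags alone */
    if (match)                /* jne consumes the flags from the cmpq */
      return i;
  }
}
```

Both functions should return the same index for any input, so the difference
is only in instruction order and in the choice of leaq over incq.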


To test this I wrapped it with


#include <string.h>
#include <stdio.h>
long IndexOfChild(long* children, long d, long count) ;
int main() {
  long n = 1000000;
  long *v = new long[n];
  memset(v, 0, n * sizeof(long));
  v[n - 1] = 42;
  long r = -1;
  for (int i = 0; i < 1000; ++i)
    r = IndexOfChild(v, 42, n);
  printf("%ld\n", r);
  return 0;
}

Whether this is a perf problem seems to be CPU dependent. On my MacBookPro6,2
with a 2.66 GHz Core i7 I get 1.243s for the clang version and 1.173s for the
gcc version.
