[LLVMbugs] [Bug 935] NEW: Loop optimization deficiencies

Tue Oct 3 11:47:01 PDT 2006

http://llvm.org/bugs/show_bug.cgi?id=935

           Summary: Loop optimization deficiencies
           Product: libraries
           Version: 1.5
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Scalar Optimizations
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: sabre at nondot.org

Consider these nested loops:

void foo(unsigned char *ptr, unsigned r, unsigned g, unsigned b, unsigned h, unsigned w) {
  unsigned row, col;

  for (row = 0; row < h; row++) {
    for (col = 0; col < w; col += 3) {
      ptr[col] = r;
      ptr[col+1] = g;
      ptr[col+2] = b;
    }
  }
}

We currently don't rotate the inner loop (bad): the exit test is a separate block that the loop is entered through with a jump to LBB1_4. Even so, this produces decent X86 code for the inner loop:

        xorl %ebx, %ebx
        jmp LBB1_4      #bb16
LBB1_3: #bb1
        movb %cl, (%edi,%ebx)
        movb %ah, 1(%edi,%ebx)
        movb %al, 2(%edi,%ebx)
        addl $3, %ebx
LBB1_4: #bb16
        cmpl %edx, %ebx
        jb LBB1_3       #bb1
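
For reference, a rotated loop keeps its exit test at the bottom and is protected by a single guard on entry, so the initial jump to the test block goes away. The following is only a source-level sketch of that shape, not the output of any LLVM pass; the names fill_row_plain, fill_row_rotated and rotated_matches_plain are hypothetical and exist purely for illustration.

```c
#include <string.h>

/* Inner loop as written in foo(): the test runs before each iteration. */
static void fill_row_plain(unsigned char *ptr, unsigned r, unsigned g,
                           unsigned b, unsigned w) {
    unsigned col;
    for (col = 0; col < w; col += 3) {
        ptr[col] = r;
        ptr[col + 1] = g;
        ptr[col + 2] = b;
    }
}

/* Rotated shape: one guard on entry, then a bottom-tested loop. */
static void fill_row_rotated(unsigned char *ptr, unsigned r, unsigned g,
                             unsigned b, unsigned w) {
    unsigned col = 0;
    if (col < w) {
        do {
            ptr[col] = r;
            ptr[col + 1] = g;
            ptr[col + 2] = b;
            col += 3;
        } while (col < w);
    }
}

/* Hypothetical check helper: returns 1 if both forms write the same
   bytes. The caller must keep w <= 60 so writes stay inside the buffers. */
static int rotated_matches_plain(unsigned w) {
    unsigned char a[64], c[64];
    memset(a, 0, sizeof a);
    memset(c, 0, sizeof c);
    fill_row_plain(a, 1, 2, 3, w);
    fill_row_rotated(c, 1, 2, 3, w);
    return memcmp(a, c, sizeof a) == 0;
}
```

The guard matters for w == 0, where the do/while body must not run at all.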

For PowerPC, though, we're getting an extra IV (r10, recomputed as r3 + r9 on every iteration):

        li r9, 0
        b LBB1_4        ;bb16
LBB1_3: ;bb1
        stbx r4, r3, r9
        add r10, r3, r9
        addi r9, r9, 3
        stb r5, 1(r10)
        stb r6, 2(r10)
LBB1_4: ;bb16
        cmplw cr0, r9, r8
        blt cr0, LBB1_3 ;bb1
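
The add r10, r3, r9 above is there because PowerPC stores take either a register+register or a register+immediate address, not both at once. One way to picture a single-IV form is to step a pointer instead of an index, so that all three stores use constant displacements from one register. The sketch below is a source-level illustration under that assumption; fill_row_indexed, fill_row_ptr_iv and ptr_iv_matches_indexed are hypothetical names, and nothing here guarantees a given compiler emits this shape.

```c
#include <string.h>

/* Indexed form, matching the original inner loop. */
static void fill_row_indexed(unsigned char *ptr, unsigned r, unsigned g,
                             unsigned b, unsigned w) {
    unsigned col;
    for (col = 0; col < w; col += 3) {
        ptr[col] = r;
        ptr[col + 1] = g;
        ptr[col + 2] = b;
    }
}

/* Pointer form: p is the only value updated per iteration.
   Assumption: w is small enough that col would never wrap in the
   indexed form, so both loops run the same number of times. */
static void fill_row_ptr_iv(unsigned char *ptr, unsigned r, unsigned g,
                            unsigned b, unsigned w) {
    unsigned char *p = ptr;
    unsigned char *end = ptr + w;
    for (; p < end; p += 3) {
        p[0] = r;
        p[1] = g;
        p[2] = b;
    }
}

/* Hypothetical check helper: returns 1 if both forms write the same
   bytes. The caller must keep w <= 60 so writes stay inside the buffers. */
static int ptr_iv_matches_indexed(unsigned w) {
    unsigned char a[64], c[64];
    memset(a, 0, sizeof a);
    memset(c, 0, sizeof c);
    fill_row_indexed(a, 1, 2, 3, w);
    fill_row_ptr_iv(c, 1, 2, 3, w);
    return memcmp(a, c, sizeof a) == 0;
}
```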

Further, the use of postinc addressing modes would eliminate the need for *any* adds in the loop.
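
At the source level, the idea corresponds roughly to bumping the pointer as part of each store and counting iterations down, so no separate add for the address or the index is left in the loop body. This is a sketch only: fill_row_by_index, fill_row_postinc and postinc_matches_index are hypothetical names, the trip count is assumed to be ceil(w / 3) (true as long as col never wraps), and PowerPC's update-form stores such as stbu are pre-increment rather than post-increment, so the backend would need a slightly different arrangement than what is written here.

```c
#include <string.h>

/* Indexed reference form, matching the original inner loop. */
static void fill_row_by_index(unsigned char *ptr, unsigned r, unsigned g,
                              unsigned b, unsigned w) {
    unsigned col;
    for (col = 0; col < w; col += 3) {
        ptr[col] = r;
        ptr[col + 1] = g;
        ptr[col + 2] = b;
    }
}

/* Pointer-bumping form with a down-counting trip count. */
static void fill_row_postinc(unsigned char *ptr, unsigned r, unsigned g,
                             unsigned b, unsigned w) {
    unsigned char *p = ptr;
    unsigned n = w / 3 + (w % 3 != 0); /* ceil(w / 3) without overflow */
    for (; n != 0; n--) {
        *p++ = (unsigned char)r; /* store, then advance */
        *p++ = (unsigned char)g;
        *p++ = (unsigned char)b;
    }
}

/* Hypothetical check helper: returns 1 if both forms write the same
   bytes. The caller must keep w <= 60 so writes stay inside the buffers. */
static int postinc_matches_index(unsigned w) {
    unsigned char a[64], c[64];
    memset(a, 0, sizeof a);
    memset(c, 0, sizeof c);
    fill_row_by_index(a, 1, 2, 3, w);
    fill_row_postinc(c, 1, 2, 3, w);
    return memcmp(a, c, sizeof a) == 0;
}
```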

-Chris
