[LLVMbugs] [Bug 20134] New: sign extensions cause suboptimal pointer arithmetic?

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Thu Jun 26 09:39:39 PDT 2014


http://llvm.org/bugs/show_bug.cgi?id=20134

            Bug ID: 20134
           Summary: sign extensions cause suboptimal pointer arithmetic?
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Scalar Optimizations
          Assignee: unassignedbugs at nondot.org
          Reporter: spatel+llvm at rotateright.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Consider this program:

void foo(int *a, int i) {
    a[i] = a[i+1] + a[i+2];
}

----------------------------------

Or as slightly optimized LLVM IR for a 64-bit system:

define void @foo(i32* nocapture %a, i32 %i) #0 {
entry:
  %add = add nsw i32 %i, 1
  %idxprom = sext i32 %add to i64
  %arrayidx = getelementptr inbounds i32* %a, i64 %idxprom
  %0 = load i32* %arrayidx, align 4
  %add1 = add nsw i32 %i, 2
  %idxprom2 = sext i32 %add1 to i64
  %arrayidx3 = getelementptr inbounds i32* %a, i64 %idxprom2
  %1 = load i32* %arrayidx3, align 4
  %add4 = add nsw i32 %1, %0
  %idxprom5 = sext i32 %i to i64
  %arrayidx6 = getelementptr inbounds i32* %a, i64 %idxprom5
  store i32 %add4, i32* %arrayidx6, align 4
  ret void
}

-----------------------------------

When compiled for x86-64 with r211521, we get:

_foo:
00    leal    0x1(%rsi), %eax
03    cltq                         <--- sign extend
05    leal    0x2(%rsi), %ecx      
08    movslq    %ecx, %rcx           <--- sign extend
0b    movl    (%rdi,%rcx,4), %ecx
0e    addl    (%rdi,%rax,4), %ecx
11    movslq    %esi, %rax           <--- sign extend
14    movl    %ecx, (%rdi,%rax,4)
17    ret

Is it possible to recognize that 'i' is being sign extended separately after
each add, move a single sign extend ahead of those adds, and do the adds in
64-bit? The adds are marked 'nsw', so sext(i + C) should equal sext(i) + C.

If we could do that, I think we would produce the optimal codegen:

_foo:
00    movslq    %esi, %rsi
03    movl    0x4(%rdi,%rsi,4), %eax
07    addl    0x8(%rdi,%rsi,4), %eax
0b    movl    %eax, (%rdi,%rsi,4)
0e    ret

This code is faster and about 37% smaller (15 vs. 24 bytes), and this is what
gcc 4.9 produces at -O1.
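
For illustration, here is a rough source-level sketch of the rewrite being asked
about. It is not a proposed patch: foo_widened and the two widen_* helpers are
hypothetical names introduced only for this example. The helpers show why the
rewrite depends on the 32-bit adds not overflowing (the 'nsw' flag in the IR):
if the add were allowed to wrap, widening before and after the add would
disagree at INT_MAX.

```c
#include <limits.h>
#include <stdint.h>

/* Original form: each index is computed in 32 bits and then widened,
 * which is where the three separate sign extensions come from. */
void foo(int *a, int i) {
    a[i] = a[i + 1] + a[i + 2];
}

/* Hypothetical hand-widened form: sign extend 'i' once, then do the
 * index arithmetic in 64 bits so the +1 and +2 can fold into the
 * addressing mode as displacements. */
void foo_widened(int *a, int i) {
    int64_t idx = (int64_t)i;
    a[idx] = a[idx + 1] + a[idx + 2];
}

/* Widen first, then add in 64 bits. */
int64_t widen_before_add(int i, int k) {
    return (int64_t)i + (int64_t)k;
}

/* Add with explicit 32-bit wraparound, then widen. This models an add
 * WITHOUT nsw. The unsigned-to-signed conversion is implementation-defined
 * in C; gcc and clang wrap modulo 2^32, which is assumed here. */
int64_t widen_after_wrapping_add(int i, int k) {
    uint32_t wrapped = (uint32_t)i + (uint32_t)k;
    return (int64_t)(int32_t)wrapped;
}
```

The two foo variants agree for every input on which the original is well
defined, since signed overflow in i + 1 or i + 2 is undefined behavior in C.
The two helpers agree whenever the 32-bit add does not overflow and differ when
it does, which is the case 'nsw' rules out.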
