<html>

    <head>

      <base href="http://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - sign extensions cause suboptimal pointer arithmetic?"

   href="http://llvm.org/bugs/show_bug.cgi?id=20134">20134</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>sign extensions cause suboptimal pointer arithmetic?

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>All

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Scalar Optimizations

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>spatel+llvm@rotateright.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvmbugs@cs.uiuc.edu

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Consider this program:

void foo(int *a, int i) {

    a[i] = a[i+1] + a[i+2];

}

----------------------------------

Or as slightly optimized LLVM IR for a 64-bit system:

define void @foo(i32* nocapture %a, i32 %i) #0 {

entry:

  %add = add nsw i32 %i, 1

  %idxprom = sext i32 %add to i64

  %arrayidx = getelementptr inbounds i32* %a, i64 %idxprom

  %0 = load i32* %arrayidx, align 4

  %add1 = add nsw i32 %i, 2

  %idxprom2 = sext i32 %add1 to i64

  %arrayidx3 = getelementptr inbounds i32* %a, i64 %idxprom2

  %1 = load i32* %arrayidx3, align 4

  %add4 = add nsw i32 %1, %0

  %idxprom5 = sext i32 %i to i64

  %arrayidx6 = getelementptr inbounds i32* %a, i64 %idxprom5

  store i32 %add4, i32* %arrayidx6, align 4

  ret void

}

-----------------------------------

When compiled for x86-64 with r211521, we get:

_foo:

00    leal    0x1(%rsi), %eax

03    cltq                         <--- sign extend

05    leal    0x2(%rsi), %ecx      

08    movslq    %ecx, %rcx           <--- sign extend

0b    movl    (%rdi,%rcx,4), %ecx

0e    addl    (%rdi,%rax,4), %ecx

11    movslq    %esi, %rax           <--- sign extend

14    movl    %ecx, (%rdi,%rax,4)

17    ret

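Is it possible to recognize that 'i' is sign extended after each of several

adds, hoist a single sign extend ahead of those adds, and do the adds in

64-bit? Because the adds are marked 'nsw', sext(add nsw %i, C) equals

add nsw (sext %i), C, so the hoisting should be legal.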
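
For illustration, here is a source-level sketch of the same idea. It assumes a
target where 'int' is 32 bits and 'long long' is 64 bits; 'foo_hoisted' is a
hypothetical name and is not part of the original test case. The two functions
behave the same whenever i+1 and i+2 do not overflow 'int' (overflow there is
undefined in C, which is where the 'nsw' flags come from).

```c
/* Original form: each index is computed in 32-bit int, then widened. */
void foo(int *a, int i) {
    a[i] = a[i + 1] + a[i + 2];
}

/* Hand-hoisted form: widen i once, then do the index math in 64 bits. */
void foo_hoisted(int *a, int i) {
    long long idx = i;                  /* the only sign extension */
    a[idx] = a[idx + 1] + a[idx + 2];   /* +1 and +2 are 64-bit adds */
}
```

In the hoisted form the constant offsets can fold into the addressing mode as
displacements, so only the one sign extend should remain.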

If we could do that, I think we would produce the optimal codegen:

_foo:

00    movslq    %esi, %rsi

03    movl    0x4(%rdi,%rsi,4), %eax

07    addl    0x8(%rdi,%rsi,4), %eax

0b    movl    %eax, (%rdi,%rsi,4)

0e    ret

This code should be faster (5 instructions instead of 9) and is about 37%

smaller (15 vs. 24 bytes), and it is what gcc 4.9 produces at -O1.</pre>

        </div>

      </p>


    </body>

</html>