<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - [x86] hoist complex address generation common subexpressions?"
   href="https://llvm.org/bugs/show_bug.cgi?id=25261">25261</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[x86] hoist complex address generation common subexpressions?
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: X86
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>spatel+llvm@rotateright.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>There's a potential optimization remaining from the test case in <a class="bz_bug_link 
          bz_status_RESOLVED  bz_closed"
   title="RESOLVED FIXED - sign extensions cause suboptimal pointer arithmetic?"
   href="show_bug.cgi?id=20134">bug 20134</a>:

void foo(int *a, int i) {
    a[i] = a[i+1] + a[i+2];
}

After:
<a href="http://llvm.org/viewvc/llvm-project?view=revision&revision=250560">http://llvm.org/viewvc/llvm-project?view=revision&revision=250560</a>

We generate:
00    movslq   %esi, %rax
03    movl     0x4(%rdi,%rax,4), %ecx
07    addl     0x8(%rdi,%rax,4), %ecx
0b    movl     %ecx, (%rdi,%rax,4)
0e    retq

We could hoist the common address subexpression (%rdi,%rax,4) out of the load and
store addresses:

00    movslq   %esi, %rax
03    leaq     (%rdi,%rax,4), %rax
07    movl     0x4(%rax), %ecx
0a    addl     0x8(%rax), %ecx
0d    movl     %ecx, (%rax)
0f    retq
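
For illustration only, the same hoisting can be sketched at the source level.
The name foo_hoisted and the local pointer p are hypothetical, and this is not
a way to force the LEA: the optimizer is free to emit identical code for both
functions.

```c
/* Original form: each access is written as base + index + offset. */
void foo(int *a, int i) {
    a[i] = a[i + 1] + a[i + 2];
}

/* Hand-hoisted form (hypothetical name): the shared address a + i is
   computed once, which is the role the LEA plays in the listing above. */
void foo_hoisted(int *a, int i) {
    int *p = a + i;
    p[0] = p[1] + p[2];
}
```

Both functions store the same value; only the shape of the address
computation differs.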

Deciding whether this is worthwhile probably requires simulation and
micro-benchmarking across micro-architectures. The LEA version appears to
increase code size by 1 byte (hand-hacked asm; I didn't verify that it's
correct) in this case, where the common subexpression is used 3 times. In the
general case, using an LEA may also require an additional register. Whether the
simpler address calculations actually save time may also be
implementation-dependent.

The solution to this problem may be related to whatever is implemented for <a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - [x86] avoid big bad immediates in the instruction stream, part 1: use SIB addressing"
   href="show_bug.cgi?id=24447">bug
24447</a>.</pre>
        </div>
      </p>
    </body>
</html>