<html>
    <head>
      <base href="http://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - GlobalISel: placement of constants in the entry-block and fast regalloc result in lots of reloaded constants"
   href="http://bugs.llvm.org/show_bug.cgi?id=32561">32561</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>GlobalISel: placement of constants in the entry-block and fast regalloc result in lots of reloaded constants
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>GlobalISel
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>ahmed.bougacha@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>We currently always emit constants (including globals and ConstantExprs) and
G_FRAME_INDEX in the entry block.

At -O0, the fast register allocator (and the lack of constant
rematerialization) causes us to spill every constant, and reload it at every
use.

This is particularly egregious for G_FRAME_INDEX, which is omnipresent at -O0.

We've known about this since the initial prototype, and it's starting to become
a problem.

We've mitigated a lot of this by teaching the AArch64 selector to do various
foldings (that we should do regardless), but the issue remains, especially for
other targets.

We considered various alternative schemes:
- emit constants once per block (this is roughly equivalent to
DAGISel/FastISel)
- emit constants at every use
- emit constants in the entry block, but move them closer to uses in a separate
pass
- "select" constants by duplicating them at each use
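
To make the trade-off concrete, here is a rough sketch of a toy cost model
for the entry-block scheme and the first two alternatives. This is not LLVM
code: the function names, the USES data, and the cost rule (one spill per
constant that is live out of the entry block, plus one reload per use in
another block) are assumptions made up for illustration, based only on the
behavior described above.

```python
from collections import Counter

# Hypothetical function: each entry is (block name, constant) for one use.
USES = [
    ("entry", "fi0"),
    ("loop", "fi0"),
    ("loop", "fi0"),
    ("loop", "c42"),
    ("exit", "fi0"),
    ("exit", "c42"),
]


def entry_block_cost(uses, entry="entry"):
    """Materialize each constant once in the entry block.

    Assumed model of the fast allocator: a value that is live out of its
    defining block is spilled once and reloaded at every use in another block.
    """
    cost = Counter()
    for const in sorted({c for _, c in uses}):
        remote = sum(1 for b, c in uses if c == const and b != entry)
        cost["materialize"] += 1
        cost["spill"] += 1 if remote else 0
        cost["reload"] += remote
    return cost


def per_block_cost(uses):
    """Materialize each constant once in every block that uses it."""
    return Counter(materialize=len(set(uses)))


def per_use_cost(uses):
    """Materialize a fresh copy of the constant at every use."""
    return Counter(materialize=len(uses))


for name, cost in [
    ("entry block", entry_block_cost(USES)),
    ("per block", per_block_cost(USES)),
    ("per use", per_use_cost(USES)),
]:
    print(name, dict(cost))
```

The model only counts instructions. It ignores that a reload and a
rematerialized constant can differ in size and latency, and that the selector
may fold a duplicated constant into its user.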

The first two have the problem of making the IRTranslator different depending
on optimization level (because we'd disable the duplication at higher
optimization levels, as it can be pessimizing if it doesn't result in folding
that would happen with entry-block-constants anyway), which is something we've
tried to avoid.

The third is conceptually awkward, especially considering the bar for adding new
passes is very high.

In the experiments we conducted with Quentin, the first three options resulted
in overall improvements in code size.

I haven't tried the last one.

At higher optimization levels, there's room for smarter placement, but we don't
want to have a full-blown rematerialization pass at -O0.</pre>
        </div>
      </p>


    </body>
</html>