[llvm-bugs] [Bug 32561] New: GlobalISel: placement of constants in the entry-block and fast regalloc result in lots of reloaded constants

via llvm-bugs llvm-bugs at lists.llvm.org
Thu Apr 6 17:37:01 PDT 2017


http://bugs.llvm.org/show_bug.cgi?id=32561

            Bug ID: 32561
           Summary: GlobalISel: placement of constants in the entry-block
                    and fast regalloc result in lots of reloaded constants
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: GlobalISel
          Assignee: unassignedbugs at nondot.org
          Reporter: ahmed.bougacha at gmail.com
                CC: llvm-bugs at lists.llvm.org

We currently always emit constants (including globals and ConstantExprs) and
G_FRAME_INDEX in the entry block.

At -O0, the fast register allocator, combined with the lack of constant
rematerialization, causes us to spill every constant and reload it at every
use.

This is particularly egregious for G_FRAME_INDEX, which is omnipresent at -O0.
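
As a rough illustration of that cost, here is a toy Python model. It is not
LLVM code: the function name, the block names and the use list are made up,
and it simply assumes the behaviour described above (each constant is
materialized once in the entry block, spilled once, and reloaded at every
use).

```python
from collections import Counter

def entry_block_cost(uses):
    """Count instructions for entry-block placement in the toy model.

    `uses` is a list of (block, constant) pairs, one per use.
    Assumption: every distinct constant is materialized once and spilled
    once, and every single use needs its own reload.
    """
    constants = {const for _, const in uses}
    return Counter(
        materialize=len(constants),
        spill=len(constants),
        reload=len(uses),
    )

# Hypothetical function: one frame index used in several blocks, plus
# one integer constant used once.
uses = [
    ("entry", "fi0"),
    ("then", "fi0"),
    ("then", "42"),
    ("else", "fi0"),
    ("exit", "fi0"),
]

cost = entry_block_cost(uses)
total = sum(cost.values())
```

In this made-up example, two constants turn into nine instructions, and the
reloads grow with the number of uses, which is why G_FRAME_INDEX hurts most.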

We've known about this since the initial prototype, and it's starting to become
a problem.

We've mitigated a lot of this by teaching the AArch64 selector to do various
foldings (that we should do regardless), but the issue remains, especially for
other targets.

We considered various alternative schemes:
- emit constants once per block (this is roughly equivalent to
DAGISel/FastISel)
- emit constants at every use
- emit constants in the entry block, but move them closer to uses in a separate
pass
- "select" constants by duplicating them at each use
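
To make the difference between these placements concrete, here is a small
Python sketch that only counts how many times a constant would be
materialized. The placement names and the use list are hypothetical, and the
count ignores spills, reloads and folding, so it says nothing about actual
code size.

```python
def materializations(uses, placement):
    """Count constant materializations for one placement strategy.

    `uses` is a list of (block, constant) pairs, one per use.
    """
    if placement == "entry":
        # One copy of each distinct constant, in the entry block.
        return len({const for _, const in uses})
    if placement == "per_block":
        # One copy of each constant in every block that uses it.
        return len(set(uses))
    if placement == "per_use":
        # A fresh copy next to every use.
        return len(uses)
    raise ValueError(f"unknown placement: {placement}")

# Hypothetical use list: "fi0" is used twice in block "then".
uses = [
    ("entry", "fi0"),
    ("then", "fi0"),
    ("then", "fi0"),
    ("then", "42"),
    ("exit", "fi0"),
]

counts = {p: materializations(uses, p)
          for p in ("entry", "per_block", "per_use")}
```

Entry-block placement materializes the fewest constants, but each copy is then
live across the whole function. The per-block and per-use placements emit more
copies that are each short-lived. Duplicating constants during selection would
land close to the per-use count in this model.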

The first two have the problem of making the IRTranslator behave differently
depending on optimization level, which is something we've tried to avoid. We'd
disable the duplication at higher optimization levels, as it can be pessimizing
if it doesn't result in folding that would happen with entry-block constants
anyway.

The third is conceptually awkward, especially considering the bar for adding
new passes is very high.

In the experiments we conducted with Quentin, the first three options resulted
in overall improvements in code size.

I haven't tried the last one.

At higher optimization levels, there's room for smarter placement, but we don't
want to have a full-blown rematerialization pass at -O0.
