[LLVMbugs] [Bug 20014] New: Pathological register allocation within gigantic single basic block function (x86-64)

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Thu Jun 12 10:15:54 PDT 2014


http://llvm.org/bugs/show_bug.cgi?id=20014

            Bug ID: 20014
           Summary: Pathological register allocation within gigantic
                    single basic block function (x86-64)
           Product: new-bugs
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: vlad at drpetric.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

The code in question is from an Othello minimax engine. 
https://github.com/vladpetric/ntest/branches (jmobs2 branch)

The static evaluation function of ntest is a huge basic block, of around 550
instructions. It's a complex basic block, with a lot of live values.

Unfortunately, clang produces no fewer than 75 spills! I count a spill as a
store/load pair to the stack, which includes pushes, pops, moves and other
instructions with an operand in memory and using %rsp (callee-managed registers
are included here as well). At the same time gcc 4.8.2 generates 9, and icc
only 4. Given that this is a single linear basic block, all instructions are
executed exactly the same number of times.

Instruction statistics for the function CEvaluator::EvalMobs :

gcc 4.8.2:
    Total st stores     loads  st loads     moves       alu      mult    popcnt 
      547         9       122         9       117       261        28         1 

icc:
    Total st stores     loads  st loads     moves       alu      mult    popcnt 
      541         4       122         4       122       260        28         1

clang (3.5 trunk):
    Total st stores     loads  st loads     moves       alu      mult    popcnt 
      704        76       122        75        81       321        28         1

A few notes:

* there are no branches or non-stack stores. st loads mean stack loads, st
stores mean stack stores.

* non-stack loads are loads from evaluation tables. The mults come primarily
from the "magic multiplier" bit gather trick. popcnt - well, that's explicitly
called. There's a constant number of them.

* an x86_64 ALU instruction that has a memory operand is counted twice - both
ALU and load. Such operations are likely broken into two micro-ops, generally
speaking (one load and one alu).

The code can be compiled by fetching the sourcecode, then a straightfoward
cmake project, but I've also attached the .ll file which includes the
pathological function (CEvaluator::EvalMobs).

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20140612/d024cf9f/attachment.html>


More information about the llvm-bugs mailing list