[LLVMbugs] [Bug 6054] New: Virtual methods de-virtualization

bugzilla-daemon at cs.uiuc.edu bugzilla-daemon at cs.uiuc.edu
Sat Jan 16 01:14:49 PST 2010


http://llvm.org/bugs/show_bug.cgi?id=6054

           Summary: Virtual methods de-virtualization
           Product: new-bugs
           Version: 2.6
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Keywords: code-quality, new-feature
          Severity: normal
          Priority: P2
         Component: new bugs
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: bearophile at mailas.com
                CC: llvmbugs at cs.uiuc.edu


A good way to spot places where a compiler can be improved is to compare the
runtime of a compiled program with different compilers. This must be done with
care to be sure to compare apples with apples if the two compilers are very
different.

This small program stresses virtual method calls. I have compared a small C++
program compiled with llvm-g++ with a similar Java program JIT-compiled with
Java HotSpot.

The program creates a binary tree with 932465 nodes, and then scans it 600
times computing the sum of the values stored in the leaves, and increments such
leave values each time. The tree contains two different kinds of nodes,
internal ones and leaves, so the walking on the tree is done with a virtual
function (the same problem can be solved in other ways, using tagged structs or
tagged struct pointers, but such ways are quite bug-prone and require a good
amount of time to be implemented, so in most situations it's much better if the
compiler is able to optimize things better). Here de-virtualizing the two
virtual calls probably leads to a performance improvement.

I have used a little portable random number generator to create the same trees
in both C++ and Java.


Code compiled with:

LLVM 2.6 gcc version 4.2.1 (Based on Apple Inc. build 5649) (LLVM build)

Java build 1.7.0-ea-b76

llvm-g++ -Wall -O3 -s -Wl,--enable-stdcall-fixup -fomit-frame-pointer
-march=native -ffast-math

java -server SumTree

CPU: Celeron 2.33 GHz, on Windows Vista 32 bit

As you can see from the timings it's not a matter of GC and memory allocations,
the building phase in the C++ is actually faster:

Timings, just tree building, best of 3, seconds:
  C++:   0.17
  Java:  0.55

Timings full program, best of 3, seconds:
  C++:  12.47
  Java:  7.90

--------------------------------------------

Asm of Internal.walkSum produced by llvm-g++:

    pushl   %edi
    pushl   %esi
    subl    $4, %esp
    movl    16(%esp), %esi
    movl    4(%esi), %eax
    testl   %eax, %eax
    je  LBB2_5
    movl    (%eax), %ecx
    movl    %eax, (%esp)
    call    *(%ecx)
LBB2_2:
    movl    %eax, %edi
    movl    8(%esi), %eax
    testl   %eax, %eax
    je  LBB2_6
    movl    (%eax), %ecx
    movl    %eax, (%esp)
    call    *(%ecx)
LBB2_4:
    addl    %edi, %eax
    addl    $4, %esp
    popl    %esi
    popl    %edi
    ret
LBB2_5:
    xorl    %eax, %eax
    jmp LBB2_2
LBB2_6:
    xorl    %eax, %eax
    jmp LBB2_4

--------------------------------------------

In attach you can find the asm produced by the JVM too, but it's not easy to
read.

Asm of Internal.walkSum from the Java version.
time java -XX:+PrintOptoAssembly -server SumTree
JVM build 1.7.0-ea-fastdebug-b76


-- 
Configure bugmail: http://llvm.org/bugs/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



More information about the llvm-bugs mailing list