[llvm-bugs] [Bug 24850] New: LLVM built 445.gobmk is 17% slower than gcc on power8
via llvm-bugs
llvm-bugs at lists.llvm.org
Wed Sep 16 14:42:03 PDT 2015
https://llvm.org/bugs/show_bug.cgi?id=24850
Bug ID: 24850
Summary: LLVM built 445.gobmk is 17% slower than gcc on power8
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Backend: PowerPC
Assignee: unassignedbugs at nondot.org
Reporter: carrot at google.com
CC: llvm-bugs at lists.llvm.org
Classification: Unclassified
The LLVM-built 445.gobmk is 17% slower than the gcc-built binary on power8.
gcc 438s
llvm 512s
For input data trevord.tst, llvm is 18% slower.
The problem is in function popgo. It consumes 4.11% of the run time in the
gcc-built binary, but 13.98% in the llvm-built binary.
The related code snippet is in engine/board.c:
struct change_stack_entry {
  int *address;
  int value;
};

static struct change_stack_entry *change_stack_pointer;

#define POP_MOVE()\
  while ((--change_stack_pointer)->address)\
    *(change_stack_pointer->address) =\
      change_stack_pointer->value
LLVM generated code sequence is:
68.05 : 1000a9f0: ld r3,-22832(r29) // A
0.66 : 1000a9f4: addi r4,r3,-16
0.17 : 1000a9f8: std r4,-22832(r29) // B
0.02 : 1000a9fc: ori r2,r2,0
14.30 : 1000aa00: ld r4,-16(r3)
0.00 : 1000aa04: cmpldi r4,0
0.00 : 1000aa08: beq 1000aa18 <popgo+0xa8>
0.53 : 1000aa0c: lwz r3,-8(r3)
0.11 : 1000aa10: stw r3,0(r4)
0.00 : 1000aa14: b 1000a9f0 <popgo+0x80>
Instruction A reads the variable change_stack_pointer and instruction B writes
it back, so both the load and the store are executed on every loop iteration.
GCC generated code sequence is:
48.30 : 10010280: lwz r8,24(r9)
0.00 : 10010284: mr r7,r9
0.00 : 10010288: addi r9,r9,-16
0.63 : 1001028c: stw r8,0(r10)
0.00 : 10010290: ld r10,16(r9)
0.00 : 10010294: cmpdi cr7,r10,0
0.00 : 10010298: bne cr7,10010280 <popgo+0x90>
15.54 : 1001029c: nop
Note that gcc keeps the variable change_stack_pointer in register r9: it reads
the variable at the start of the function and writes it back after the loop.
This optimization is safe because change_stack_pointer is a static variable
whose address is never assigned to another variable, so no other pointer can
alias it.
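
The transformation gcc performs can be written out by hand as a rough sketch.
The struct and the loop shape come from the snippet above; the backing array,
STACK_SIZE, begin_move(), push_change() and the two function names are
hypothetical additions so the example compiles on its own, and are not taken
from engine/board.c. pop_move_global() has the same shape as POP_MOVE(), while
pop_move_hoisted() loads the global once into a local and stores it back once
after the loop, which is roughly what the gcc code above does with r9.

```c
#include <stddef.h>

struct change_stack_entry {
  int *address;
  int value;
};

#define STACK_SIZE 64 /* hypothetical size, for illustration only */

static struct change_stack_entry change_stack[STACK_SIZE];
static struct change_stack_entry *change_stack_pointer = change_stack;

/* Hypothetical helper: push a sentinel entry (NULL address) that marks
 * where the undo loop has to stop. */
void begin_move(void)
{
  change_stack_pointer->address = NULL;
  change_stack_pointer->value = 0;
  change_stack_pointer++;
}

/* Hypothetical helper: remember the old value of *address, then overwrite it. */
void push_change(int *address, int new_value)
{
  change_stack_pointer->address = address;
  change_stack_pointer->value = *address;
  change_stack_pointer++;
  *address = new_value;
}

/* Same shape as POP_MOVE(): as written, the global is read and written
 * on every iteration unless the compiler promotes it to a register. */
void pop_move_global(void)
{
  while ((--change_stack_pointer)->address)
    *(change_stack_pointer->address) = change_stack_pointer->value;
}

/* Hand-promoted variant: one load before the loop, one store after it. */
void pop_move_hoisted(void)
{
  struct change_stack_entry *sp = change_stack_pointer;

  while ((--sp)->address)
    *(sp->address) = sp->value;
  change_stack_pointer = sp;
}
```

Both functions undo the same recorded changes and leave change_stack_pointer
at the sentinel entry; they differ only in how often the global is accessed
inside the loop.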
Even if I add -fstrict-aliasing explicitly to the llvm command line, llvm only
moves the read of change_stack_pointer out of the loop; the write of
change_stack_pointer still remains in the loop.
Command line options are:
-DSPEC_CPU -DNDEBUG -DHAVE_CONFIG_H -I. -I.. -I../include -I./include
-fno-strict-aliasing -O2 -m64 -mvsx -mcpu=power8 -DSPEC_CPU_LP64