[llvm-bugs] [Bug 24620] New: BranchProbabilities::scale is a very hot function but its assembly is very inefficient.
via llvm-bugs
llvm-bugs at lists.llvm.org
Fri Aug 28 13:09:00 PDT 2015
https://llvm.org/bugs/show_bug.cgi?id=24620
Bug ID: 24620
Summary: BranchProbabilities::scale is a very hot function but
its assembly is very inefficient.
Product: libraries
Version: trunk
Hardware: HP
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Support Libraries
Assignee: unassignedbugs at nondot.org
Reporter: cmtice at google.com
CC: llvm-bugs at lists.llvm.org
Classification: Unclassified
Created attachment 14791
--> https://llvm.org/bugs/attachment.cgi?id=14791&action=edit
gzip'd .ii file
While recently examining a performance problem in clang (8x slower than GCC,
see https://llvm.org/bugs/show_bug.cgi?id=24618), we looked at the results of
running 'perf' on clang and saw that in this case the hottest function was
llvm::BranchProbabilities::scale (20.69% of the entire compilation was being
spent in this function).
Looking more closely at the function's assembly, annotated with perf results,
we saw:
0.08 │ xor %edx,%edx
0.15 │ imul %rax,%rdi
2.51 │ shr $0x20,%rcx
0.00 │ imul %rax,%rcx
0.93 │ mov %rdi,%rsi
0.45 │ mov %rcx,%rax
0.86 │ shr $0x20,%rsi
0.69 │ shr $0x20,%rax
1.01 │ add %esi,%ecx
0.41 │ mov $0xffffffffffffffff,%rsi
0.26 │ setb %dl
0.55 │ add %edx,%eax
0.85 │ cmp %eax,%r8d
│ ↓ ja 50
│49: mov %rsi,%rax
1.33 │ ← retq
│ nop
0.93 │50: shl $0x20,%rax
0.33 │ mov %ecx,%ecx
│ xor %edx,%edx
0.05 │ or %rcx,%rax
1.00 │ mov $0xffffffff,%r9d
0.27 │ div %r8
32.45 │ cmp %r9,%rax
1.14 │ mov %rax,%rcx
0.74 │ ↑ ja 49
0.98 │ mov %rdx,%rax
0.08 │ mov %edi,%edi
0.03 │ xor %edx,%edx
0.40 │ shl $0x20,%rax
0.94 │ shl $0x20,%rcx
0.03 │ or %rdi,%rax
0.50 │ div %r8
43.53 │ add %rcx,%rax
1.25 │ cmovae %rax,%rsi
2.61 │ ↑ jmp 49
It appears that roughly 76% of the time in this function is being spent on the
two 'div' ops (perf attributes the 32.45% and 43.53% samples to the
instructions immediately after each 'div'). This assembly is very inefficient:
the two divs ought to be done together, thus possibly halving the time spent
in this function.
(This is on Intel x86_64, BTW, in case it's not obvious from the assembly.)
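For readers following the listing, here is a minimal C++ sketch of the computation the assembly appears to perform: a saturating Num * N / D done with 32-bit "digits" and two 64-bit divides. It is a reconstruction from the disassembly above, not the LLVM source. The name scale_sketch and all variable names are hypothetical, and any fast paths that may precede the excerpt are not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reconstruction from the perf listing; not the LLVM source.
// Computes Num * N / D, saturating to UINT64_MAX on overflow.
uint64_t scale_sketch(uint64_t Num, uint32_t N, uint32_t D) {
  assert(D != 0 && "divide by 0");

  // Two 32x32 -> 64 partial products (the two 'imul' instructions).
  uint64_t ProductHigh = (Num >> 32) * N;
  uint64_t ProductLow = (Num & UINT32_MAX) * N;

  // Split the 96-bit product into three 32-bit digits.
  uint32_t Upper32 = static_cast<uint32_t>(ProductHigh >> 32);
  uint32_t Lower32 = static_cast<uint32_t>(ProductLow);
  uint32_t Mid32Partial = static_cast<uint32_t>(ProductHigh);
  uint32_t Mid32 = Mid32Partial + static_cast<uint32_t>(ProductLow >> 32);

  // Propagate the carry out of the middle digit (add / setb / add).
  Upper32 += Mid32 < Mid32Partial;

  // The quotient cannot fit in 64 bits if the top digit is >= D.
  if (Upper32 >= D)
    return UINT64_MAX;

  // First 'div': top two digits divided by D.
  uint64_t Rem = (static_cast<uint64_t>(Upper32) << 32) | Mid32;
  uint64_t UpperQ = Rem / D;

  // Mirrors the 'cmp %r9,%rax; ja' check in the listing.
  if (UpperQ > UINT32_MAX)
    return UINT64_MAX;

  // Second 'div': the remainder of the first, extended by the low digit.
  Rem = ((Rem % D) << 32) | Lower32;
  uint64_t LowerQ = Rem / D;
  uint64_t Q = (UpperQ << 32) + LowerQ;

  // Saturate if the final add wrapped (add / cmovae).
  return Q < LowerQ ? UINT64_MAX : Q;
}
```

In this sketch the second division takes the remainder of the first as its input, which matches the 'mov %rdx,%rax' between the two 'div' instructions in the listing. Any rewrite that merges the two divides would need to account for that dependency.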
This is with ToT Clang/LLVM, built with:
$ cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=/tmp/llvm-install.opt
-DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=On <path-to-llvm>
$ make all
$ make install
Attached is a gzip'd version of the .ii file we used. The clang command to
compile this file is:
/usr/local/google2/cmtice/llvm-work/llvm-install.opt/bin/clang++ -c
-fno-exceptions -Wno-multichar -m64 -Wa,--noexecstack -fPIC
-no-canonical-prefixes -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector
-D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS -DANDROID -fmessage-length=0 -W
-Wall -Wno-unused -Winit-self -Wpointer-arith -g -fno-strict-aliasing
-DNDEBUG -UDEBUG -D__compiler_offsetof=__builtin_offsetof
-Werror=int-conversion -Wno-reserved-id-macro -Wno-format-pedantic
-Wno-unused-command-line-argument -target x86_64-linux-gnu -DANDROID
-fmessage-length=0 -W -Wall -Wno-unused -Winit-self -Wpointer-arith
-Wsign-promo -DNDEBUG -UDEBUG -Wno-inconsistent-missing-override -target
x86_64-linux-gnu -DBUILDING_LIBART=1 -Wthread-safety -Wthread-safety-negative
-Wimplicit-fallthrough -Wfloat-equal -Wint-to-void-pointer-cast
-Wused-but-marked-unused -Wdeprecated -Wunreachable-code-break
-Wunreachable-code-return -Wmissing-noreturn -fno-omit-frame-pointer -fno-rtti
-std=gnu++11 -ggdb3 -Wall -Werror -Wextra -Wstrict-aliasing -fstrict-aliasing
-Wunreachable-code -Wredundant-decls -Wshadow -Wunused -fvisibility=protected
-DART_DEFAULT_GC_TYPE_IS_CMS -DIMT_SIZE=64 -DART_BASE_ADDRESS=0x60000000
-DART_DEFAULT_INSTRUCTION_SET_FEATURES=default
-DART_BASE_ADDRESS_MIN_DELTA=-0x1000000 -DART_BASE_ADDRESS_MAX_DELTA=0x1000000
-DART_DEFAULT_INSTRUCTION_SET_FEATURES="default" -O3 -Wframe-larger-than=2700
-fPIC -D_USING_LIBCXX -std=gnu++14 -nostdinc++ -Werror=int-to-pointer-cast
-Werror=pointer-to-int-cast -Werror=address-of-temporary
-Werror=null-dereference -Werror=return-type -o interpreter_goto_table_impl.o
./interpreter_goto_table_impl.ii