[llvm-bugs] [Bug 24618] New: Compilation with Clang is 8x slower than GCC, mostly due to Greedy Register Allocator

via llvm-bugs llvm-bugs at lists.llvm.org
Fri Aug 28 12:36:55 PDT 2015


https://llvm.org/bugs/show_bug.cgi?id=24618

            Bug ID: 24618
           Summary: Compilation with Clang is 8x slower than GCC, mostly
                    due to Greedy Register Allocator
           Product: libraries
           Version: trunk
          Hardware: HP
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Register Allocator
          Assignee: unassignedbugs at nondot.org
          Reporter: cmtice at google.com
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified

I have a .cpp file that takes 2 minutes and 45 seconds to compile with clang
(ToT), but only about 25 seconds to compile with gcc.  Both compilers were run
at -O3 (times were obtained with the Unix 'time' command):


Time compiling with clang:

real    2m45.683s
user    2m45.023s
sys    0m0.539s

Time compiling with gcc:

real    0m23.502s
user    0m21.340s
sys    0m2.134s
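
As a quick check of the slowdown these figures imply, the two 'time' outputs
can be converted to seconds and divided.  This is only a sketch: the helper
name parse_time is made up for illustration, and the second set of figures is
taken to be the gcc run.

```python
import re

def parse_time(s):
    """Convert a 'time'-style duration such as '2m45.683s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", s)
    if m is None:
        raise ValueError(f"unrecognized duration: {s!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

# Figures copied from the 'time' output above.
clang_real, clang_user = parse_time("2m45.683s"), parse_time("2m45.023s")
gcc_real, gcc_user = parse_time("0m23.502s"), parse_time("0m21.340s")

real_ratio = clang_real / gcc_real   # wall-clock slowdown
user_ratio = clang_user / gcc_user   # user-CPU slowdown

print(f"real: {real_ratio:.2f}x  user: {user_ratio:.2f}x")
```

By wall-clock time the ratio is close to 7x, and by user time a little under
8x.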


I passed -ftime-report to clang to find out where the time was being spent, and
it reported that 75% of the time is being spent in the Greedy Register
Allocator:

[...snip...]

===-------------------------------------------------------------------------===
                              Register Allocation
===-------------------------------------------------------------------------===
  Total Execution Time: 106.5713 seconds (106.6387 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  104.8476 ( 98.4%)   0.0202 ( 50.4%)  104.8678 ( 98.4%)  104.9201 ( 98.4%)  Spiller
    1.6598 (  1.6%)   0.0080 ( 19.9%)    1.6678 (  1.6%)    1.6695 (  1.6%)  Global Splitting
    0.0210 (  0.0%)   0.0119 ( 29.7%)    0.0329 (  0.0%)    0.0462 (  0.0%)  Evict
    0.0018 (  0.0%)   0.0000 (  0.0%)    0.0018 (  0.0%)    0.0021 (  0.0%)  Seed Live Regs
    0.0010 (  0.0%)   0.0000 (  0.0%)    0.0010 (  0.0%)    0.0009 (  0.0%)  Local Splitting
  106.5312 (100.0%)   0.0400 (100.0%)  106.5713 (100.0%)  106.6387 (100.0%)  Total
[...snip...]

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 142.7354 seconds (142.7899 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  107.0321 ( 75.4%)   0.0559 (  6.4%)  107.0881 ( 75.0%)  107.1572 ( 75.0%)  Greedy Register Allocator
    5.3523 (  3.8%)   0.0799 (  9.1%)    5.4321 (  3.8%)    5.4367 (  3.8%)  Tail Duplication
    5.2889 (  3.7%)   0.0000 (  0.0%)    5.2889 (  3.7%)    5.2914 (  3.7%)  Eliminate PHI nodes for register allocation
    3.6523 (  2.6%)   0.0040 (  0.5%)    3.6562 (  2.6%)    3.6589 (  2.6%)  Simple Register Coalescing
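
Putting the two excerpts together gives a rough idea of how much of the total
pass time goes to the Spiller alone.  This is a sketch only, and it assumes the
"Register Allocation" timer group is nested inside the Greedy Register
Allocator pass time, which the near-identical totals (106.6 vs. 107.2 seconds
of wall time) suggest but the report does not state.

```python
# Wall-clock seconds copied from the two -ftime-report excerpts above.
total_pass_wall = 142.7899   # "Pass execution timing report" total
greedy_ra_wall = 107.1572    # Greedy Register Allocator pass
ra_group_wall = 106.6387     # "Register Allocation" timer group total
spiller_wall = 104.9201      # Spiller timer within that group

greedy_share = greedy_ra_wall / total_pass_wall
spiller_share_of_ra = spiller_wall / ra_group_wall
# Assumption: the Spiller timer is a subset of the Greedy RA pass time.
spiller_share_of_total = spiller_wall / total_pass_wall

print(f"Greedy RA / all passes:  {greedy_share:.1%}")
print(f"Spiller / RA timers:     {spiller_share_of_ra:.1%}")
print(f"Spiller / all passes:    {spiller_share_of_total:.1%}")
```

Under that assumption the Spiller by itself accounts for a bit over 73% of the
total pass execution time.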

I will try to attach the entire timing report if I can attach two files to this
bug.  I also ran the compilation (with clang) under 'perf' to see where the
time was being spent, and again it looks like a large chunk of the time is
being spent in the register allocator:

+  15.89%  clang++  clang-3.8            [.] (anonymous namespace)::InlineSpiller::propagateSiblingValue(llvm::DenseMapIterator<llvm::VNInfo*, (anonymous namespace)::InlineSpiller::SibValueInfo, llvm::DenseMapInfo<
+  11.19%  clang++  clang-3.8            [.] llvm::MachineBlockFrequencyInfo::getBlockFreq(llvm::MachineBasicBlock const*) const
+   5.33%  clang++  clang-3.8            [.] llvm::BlockFrequencyInfoImplBase::getBlockFreq(llvm::BlockFrequencyInfoImplBase::BlockNode const&) const
+   3.79%  clang++  clang-3.8            [.] llvm::SmallPtrSetImplBase::FindBucketFor(void const*) const
+   3.21%  clang++  clang-3.8            [.] llvm::SSAUpdaterImpl<llvm::MachineSSAUpdater>::BuildBlockList(llvm::MachineBasicBlock*, llvm::SmallVectorImpl<llvm::SSAUpdaterImpl<llvm::MachineSSAUpdater>::BBInfo*>*)
+   2.84%  clang++  clang-3.8            [.] llvm::LiveVariables::isLiveOut(unsigned int, llvm::MachineBasicBlock const&)
+   2.68%  clang++  clang-3.8            [.] llvm::LiveRange::addSegment(llvm::LiveRange::Segment)
+   2.62%  clang++  clang-3.8            [.] llvm::SmallPtrSetImplBase::insert_imp(void const*)
+   2.59%  clang++  clang-3.8            [.] llvm::LiveRange::extendInBlock(llvm::SlotIndex, llvm::SlotIndex)
+   2.27%  clang++  clang-3.8            [.] llvm::LiveRange::find(llvm::SlotIndex)
+   2.26%  clang++  clang-3.8            [.] (anonymous namespace)::InlineSpiller::traceSiblingValue(unsigned int, llvm::VNInfo*, llvm::VNInfo*)
+   1.83%  clang++  clang-3.8            [.] llvm::BlockFrequency::operator*(llvm::BranchProbability const&) const
+   1.43%  clang++  clang-3.8            [.] extendSegmentsToUses(llvm::LiveRange&, llvm::SlotIndexes const&, llvm::SmallVector<std::pair<llvm::SlotIndex, llvm::VNInfo*>, 16u>&, llvm::LiveRange const&)
+   1.07%  clang++  clang-3.8            [.] llvm::BlockFrequency::operator*=(llvm::BranchProbability const&)
+   0.71%  clang++  clang-3.8            [.] llvm::MachineBasicBlock::isSuccessor(llvm::MachineBasicBlock const*) const
+   0.69%  clang++  clang-3.8            [.] llvm::ConnectedVNInfoEqClasses::Classify(llvm::LiveInterval const*)
+   0.56%  clang++  [kernel.kallsyms]    [k] 0xffffffff8104f45a
+   0.44%  clang++  clang-3.8            [.] llvm::DominatorTreeBase<llvm::MachineBasicBlock>::dominates(llvm::MachineBasicBlock const*, llvm::MachineBasicBlock const*) const
+   0.43%  clang++  libc-2.19.so         [.] _int_malloc
+   0.36%  clang++  clang-3.8            [.] llvm::SlotIndexes::getMBBFromIndex(llvm::SlotIndex) const
+   0.35%  clang++  libc-2.19.so         [.] __memcpy_sse2_unaligned
+   0.32%  clang++  clang-3.8            [.] llvm::Use::getImpliedUser() const



I have attached the .ii file.  Below is the command to compile this file with
clang.


/usr/local/google2/cmtice/llvm-work/llvm-install.opt/bin/clang++  -c   
-fno-exceptions -Wno-multichar -m64 -Wa,--noexecstack -fPIC
-no-canonical-prefixes  -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector
-D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS -DANDROID -fmessage-length=0 -W
-Wall -Wno-unused     -Winit-self -Wpointer-arith -g -fno-strict-aliasing
-DNDEBUG -UDEBUG           -D__compiler_offsetof=__builtin_offsetof
-Werror=int-conversion -Wno-reserved-id-macro -Wno-format-pedantic
-Wno-unused-command-line-argument   -target x86_64-linux-gnu   -DANDROID
-fmessage-length=0 -W -Wall -Wno-unused -Winit-self -Wpointer-arith
-Wsign-promo -DNDEBUG -UDEBUG  -Wno-inconsistent-missing-override   -target
x86_64-linux-gnu  -DBUILDING_LIBART=1 -Wthread-safety -Wthread-safety-negative
-Wimplicit-fallthrough -Wfloat-equal -Wint-to-void-pointer-cast
-Wused-but-marked-unused -Wdeprecated -Wunreachable-code-break
-Wunreachable-code-return -Wmissing-noreturn -fno-omit-frame-pointer -fno-rtti
-std=gnu++11 -ggdb3 -Wall -Werror -Wextra -Wstrict-aliasing -fstrict-aliasing
-Wunreachable-code -Wredundant-decls -Wshadow -Wunused -fvisibility=protected
-DART_DEFAULT_GC_TYPE_IS_CMS -DIMT_SIZE=64 -DART_BASE_ADDRESS=0x60000000
-DART_DEFAULT_INSTRUCTION_SET_FEATURES=default
-DART_BASE_ADDRESS_MIN_DELTA=-0x1000000 -DART_BASE_ADDRESS_MAX_DELTA=0x1000000
-DART_DEFAULT_INSTRUCTION_SET_FEATURES="default" -O3 -Wframe-larger-than=2700
-fPIC -D_USING_LIBCXX -std=gnu++14 -nostdinc++  -Werror=int-to-pointer-cast
-Werror=pointer-to-int-cast  -Werror=address-of-temporary
-Werror=null-dereference -Werror=return-type -o interpreter_goto_table_impl.o
./interpreter_goto_table_impl.ii
