[PATCH] D43578: -ftime-report switch support in Clang

Andrew V. Tischenko via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 12 08:57:03 PDT 2018


avt77 updated this revision to Diff 138021.
avt77 added a comment.

I changed the timers. Now we can get output like the following:

-------------------------------------------------------------------------
-------------------------------------------------------------------------

  ===== Frontend =====

-------------------------------------------------------------------------
-------------------------------------------------------------------------

  Total Execution Time: 0.3760 seconds (0.3659 wall clock)
  
   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.2080 ( 59.1%)   0.0160 ( 66.7%)   0.2240 ( 59.6%)   0.2157 ( 58.9%)  Handle Top Level Decl
   0.1120 ( 31.8%)   0.0000 (  0.0%)   0.1120 ( 29.8%)   0.1152 ( 31.5%)  Handle Translation Unit
   0.0200 (  5.7%)   0.0080 ( 33.3%)   0.0280 (  7.4%)   0.0253 (  6.9%)  Handle Tag Decl Required Definition
   0.0080 (  2.3%)   0.0000 (  0.0%)   0.0080 (  2.1%)   0.0097 (  2.6%)  Handle Inline Function Definition
   0.0040 (  1.1%)   0.0000 (  0.0%)   0.0040 (  1.1%)   0.0000 (  0.0%)  Init Backend Consumer
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Link In Modules
   0.3520 (100.0%)   0.0240 (100.0%)   0.3760 (100.0%)   0.3659 (100.0%)  Total

-------------------------------------------------------------------------
-------------------------------------------------------------------------

  ===== Clang Parser =====

-------------------------------------------------------------------------
-------------------------------------------------------------------------

  Total Execution Time: 6.7240 seconds (6.7310 wall clock)
  
   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   4.5520 ( 76.9%)   0.5560 ( 68.8%)   5.1080 ( 76.0%)   5.1109 ( 75.9%)  Parse Top Level Decl
   0.6680 ( 11.3%)   0.1200 ( 14.9%)   0.7880 ( 11.7%)   0.7732 ( 11.5%)  Parse Template
   0.6440 ( 10.9%)   0.1240 ( 15.3%)   0.7680 ( 11.4%)   0.7459 ( 11.1%)  Parse Function Definition
   0.0360 (  0.6%)   0.0040 (  0.5%)   0.0400 (  0.6%)   0.0654 (  1.0%)  Scope manipulation
   0.0120 (  0.2%)   0.0040 (  0.5%)   0.0160 (  0.2%)   0.0267 (  0.4%)  PP Macro Call Args
   0.0040 (  0.1%)   0.0000 (  0.0%)   0.0040 (  0.1%)   0.0059 (  0.1%)  PP Append Macro
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0029 (  0.0%)  Handle Pragma Directive
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0003 (  0.0%)  PP Find Handler
   5.9160 (100.0%)   0.8080 (100.0%)   6.7240 (100.0%)   6.7310 (100.0%)  Total

-------------------------------------------------------------------------
-------------------------------------------------------------------------

  ===== Sema =====

-------------------------------------------------------------------------
-------------------------------------------------------------------------

  Total Execution Time: 3.0480 seconds (3.0127 wall clock)
  
   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   1.0760 ( 38.9%)   0.0960 ( 33.8%)   1.1720 ( 38.5%)   1.1687 ( 38.8%)  Act On End Of Translation Unit: Common case
   1.0760 ( 38.9%)   0.0960 ( 33.8%)   1.1720 ( 38.5%)   1.1681 ( 38.8%)  Act On End Of Translation Unit: TUKind != TU_Prefix
   0.2520 (  9.1%)   0.0440 ( 15.5%)   0.2960 (  9.7%)   0.2534 (  8.4%)  Handle Declarator
   0.1160 (  4.2%)   0.0120 (  4.2%)   0.1280 (  4.2%)   0.1453 (  4.8%)  Act On Id Expression
   0.0520 (  1.9%)   0.0120 (  4.2%)   0.0640 (  2.1%)   0.0827 (  2.7%)  Classify Name
   0.0520 (  1.9%)   0.0120 (  4.2%)   0.0640 (  2.1%)   0.0612 (  2.0%)  Check Implicit Conversion
   0.0480 (  1.7%)   0.0040 (  1.4%)   0.0520 (  1.7%)   0.0485 (  1.6%)  Check Call
   0.0560 (  2.0%)   0.0040 (  1.4%)   0.0600 (  2.0%)   0.0465 (  1.5%)  Lookup Template Name
   0.0280 (  1.0%)   0.0000 (  0.0%)   0.0280 (  0.9%)   0.0160 (  0.5%)  Act On String Literal
   0.0040 (  0.1%)   0.0040 (  1.4%)   0.0080 (  0.3%)   0.0137 (  0.5%)  Check Builtin Function Call
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0049 (  0.2%)  Act On Predefined Expr
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0022 (  0.1%)  Check X86 Builtin Function Call
   0.0040 (  0.1%)   0.0000 (  0.0%)   0.0040 (  0.1%)   0.0014 (  0.0%)  Act On Dependent Id Expression
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Try To Recover With Call
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Try Expr As Call
   2.7640 (100.0%)   0.2840 (100.0%)   3.0480 (100.0%)   3.0127 (100.0%)  Total

-------------------------------------------------------------------------
-------------------------------------------------------------------------

  ===== Include Files =====

-------------------------------------------------------------------------
-------------------------------------------------------------------------

  Total Execution Time: 0.0600 seconds (0.0540 wall clock)
  
   ---User Time---   --User+System--   ---Wall Time---  --- Name ---
   0.0320 ( 53.3%)   0.0320 ( 53.3%)   0.0292 ( 54.0%)  Lookup File2
   0.0200 ( 33.3%)   0.0200 ( 33.3%)   0.0218 ( 40.3%)  Lookup File
   0.0040 (  6.7%)   0.0040 (  6.7%)   0.0016 (  3.0%)  Find Usable Module For Header
   0.0040 (  6.7%)   0.0040 (  6.7%)   0.0014 (  2.7%)  Should Enter Include File
   0.0600 (100.0%)   0.0600 (100.0%)   0.0540 (100.0%)  Total

The exact picture depends heavily on the content of the file being compiled, but in general it's clear what we get.
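For readers unfamiliar with how such a report is assembled, here is a minimal, self-contained sketch of the general technique: named timers that belong to a group, filled in by RAII scope guards. It is not the code from this patch. LLVM itself provides llvm::Timer, llvm::TimerGroup and llvm::NamedRegionTimer in llvm/Support/Timer.h for this purpose. The names TimerGroupSketch and ScopedTimer below are hypothetical, and the sketch records wall time only, using std::chrono.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a timer group: accumulates wall time per
// timer name and can print each entry with its share of the group total.
class TimerGroupSketch {
public:
  explicit TimerGroupSketch(std::string GroupName)
      : Name(std::move(GroupName)) {}

  // Add elapsed seconds to the named timer, creating it on first use.
  void add(const std::string &Timer, double Seconds) {
    assert(Seconds >= 0.0 && "elapsed time cannot be negative");
    Totals[Timer] += Seconds;
  }

  // Sum of all timers in this group.
  double total() const {
    double Sum = 0.0;
    for (const auto &Entry : Totals)
      Sum += Entry.second;
    return Sum;
  }

  // Share of the group total spent in one timer, in percent.
  double percent(const std::string &Timer) const {
    const auto It = Totals.find(Timer);
    const double Sum = total();
    return (It == Totals.end() || Sum == 0.0) ? 0.0
                                              : It->second / Sum * 100.0;
  }

  void print(std::FILE *Out) const {
    std::fprintf(Out, "  ===== %s =====\n", Name.c_str());
    for (const auto &Entry : Totals)
      std::fprintf(Out, "  %8.4f (%5.1f%%)  %s\n", Entry.second,
                   percent(Entry.first), Entry.first.c_str());
  }

private:
  std::string Name;
  std::map<std::string, double> Totals;
};

// RAII guard: measures the lifetime of the enclosing scope and charges
// it to one named timer of the group when the scope is left.
class ScopedTimer {
public:
  ScopedTimer(TimerGroupSketch &G, std::string TimerName)
      : Group(G), Name(std::move(TimerName)),
        Start(std::chrono::steady_clock::now()) {}
  ScopedTimer(const ScopedTimer &) = delete;
  ScopedTimer &operator=(const ScopedTimer &) = delete;

  ~ScopedTimer() {
    const std::chrono::duration<double> Elapsed =
        std::chrono::steady_clock::now() - Start;
    Group.add(Name, Elapsed.count());
  }

private:
  TimerGroupSketch &Group;
  std::string Name;
  std::chrono::steady_clock::time_point Start;
};
```

A function to be measured would declare a ScopedTimer as its first local variable, and the group would be printed once at the end of compilation. Nested guards charge the same interval to several timers, which is one reason rows in a report of this kind can overlap.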
About the performance question: I compared bootstrap times produced by the pure trunk compiler and by my compiler with timers. I got the following results (obviously, I used the patched compiler without the -ftime-report option):

Trunk compiler (4 bootstraps):
real    43m19.845s
user    161m22.448s
sys     6m57.480s

real    42m54.171s
user    160m45.444s
sys     6m50.236s

real    43m43.098s
user    161m8.112s
sys     6m46.944s

real    42m59.542s
user    160m48.304s
sys     6m48.616s

The patched compiler (3 bootstraps):

real    44m13.653s
user    162m15.588s
sys     6m55.864s

real    44m7.108s
user    162m7.740s
sys     6m47.848s

real    43m18.520s
user    162m14.008s
sys     6m44.828s

As we can see, in the worst case the degradation is less than 1.2%. From my point of view, that is absolutely acceptable.
Any comments, suggestions, or objections?
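The post does not say which runs were compared to arrive at this figure. As a sketch only, the snippet below recomputes the slowdown from the "real" times listed above under two assumed readings: slowest patched run against slowest trunk run, and mean against mean. The helper names (toSeconds, worst, mean, slowdownPercent) are illustrative. The first reading gives roughly 1.16%, which is consistent with the stated bound; the mean-against-mean reading gives roughly 1.5%.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Convert a `time`-style duration such as "43m19.845s" to seconds.
double toSeconds(const std::string &T) {
  const std::size_t M = T.find('m');
  assert(M != std::string::npos && "expected a duration like 43m19.845s");
  const double Minutes = std::stod(T.substr(0, M));
  // std::stod stops parsing at the trailing 's'.
  const double Seconds = std::stod(T.substr(M + 1));
  return Minutes * 60.0 + Seconds;
}

// Slowest run of a series, in seconds.
double worst(const std::vector<std::string> &Runs) {
  double Max = 0.0;
  for (const std::string &R : Runs)
    Max = std::max(Max, toSeconds(R));
  return Max;
}

// Arithmetic mean of a series, in seconds.
double mean(const std::vector<std::string> &Runs) {
  assert(!Runs.empty() && "need at least one run");
  double Sum = 0.0;
  for (const std::string &R : Runs)
    Sum += toSeconds(R);
  return Sum / static_cast<double>(Runs.size());
}

// Relative slowdown of Patched over Baseline, in percent.
double slowdownPercent(double Baseline, double Patched) {
  return (Patched - Baseline) / Baseline * 100.0;
}

// "real" times copied from the bootstrap runs listed in the post.
const std::vector<std::string> TrunkReal = {"43m19.845s", "42m54.171s",
                                            "43m43.098s", "42m59.542s"};
const std::vector<std::string> PatchedReal = {"44m13.653s", "44m7.108s",
                                              "43m18.520s"};
```

Calling slowdownPercent(worst(TrunkReal), worst(PatchedReal)) gives the worst-against-worst figure, and slowdownPercent(mean(TrunkReal), mean(PatchedReal)) gives the mean-against-mean one. With only three or four runs per compiler, either number should be read as a rough indication.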


https://reviews.llvm.org/D43578

Files:
  include/clang/Frontend/FrontendAction.h
  include/clang/Lex/HeaderSearch.h
  include/clang/Parse/Parser.h
  include/clang/Sema/Sema.h
  lib/CodeGen/CodeGenAction.cpp
  lib/CodeGen/CodeGenModule.cpp
  lib/Frontend/ASTMerge.cpp
  lib/Frontend/CompilerInstance.cpp
  lib/Frontend/FrontendAction.cpp
  lib/Frontend/FrontendActions.cpp
  lib/Lex/HeaderSearch.cpp
  lib/Lex/PPMacroExpansion.cpp
  lib/Lex/Pragma.cpp
  lib/Parse/ParseTemplate.cpp
  lib/Parse/Parser.cpp
  lib/Sema/Sema.cpp
  lib/Sema/SemaChecking.cpp
  lib/Sema/SemaDecl.cpp
  lib/Sema/SemaExpr.cpp
  lib/Sema/SemaTemplate.cpp
  test/Frontend/ftime-report-template-decl.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D43578.138021.patch
Type: text/x-patch
Size: 49941 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180312/92f7c2d8/attachment.bin>

