[all-commits] [llvm/llvm-project] 105642: Add PassManagerImpl.h to hide implementation details

Reid Kleckner via All-commits all-commits at lists.llvm.org
Mon Feb 3 11:16:00 PST 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 105642af5eef694332b7181e0b333215d211332b
      https://github.com/llvm/llvm-project/commit/105642af5eef694332b7181e0b333215d211332b
  Author: Reid Kleckner <rnk at google.com>
  Date:   2020-02-03 (Mon, 03 Feb 2020)

  Changed paths:
    M llvm/include/llvm/IR/PassManager.h
    A llvm/include/llvm/IR/PassManagerImpl.h
    M llvm/lib/Analysis/CGSCCPassManager.cpp
    M llvm/lib/Analysis/LoopAnalysisManager.cpp
    M llvm/lib/IR/PassManager.cpp
    M llvm/unittests/IR/PassManagerTest.cpp

  Log Message:
  -----------
  Add PassManagerImpl.h to hide implementation details

ClangBuildAnalyzer results show that a lot of time is spent
instantiating AnalysisManager::getResultImpl across the code base:

**** Templates that took longest to instantiate:
 50445 ms: llvm::AnalysisManager<llvm::Function>::getResultImpl (412 times, avg 122 ms)
 47797 ms: llvm::AnalysisManager<llvm::Function>::getResult<llvm::TargetLibraryAnalysis> (389 times, avg 122 ms)
 46894 ms: std::tie<const unsigned long long, const bool> (2452 times, avg 19 ms)
 43851 ms: llvm::BumpPtrAllocatorImpl<llvm::MallocAllocator, 4096, 4096>::Allocate (3228 times, avg 13 ms)
 33911 ms: std::tie<const unsigned int, const unsigned int, const unsigned int, const unsigned int> (897 times, avg 37 ms)
 33854 ms: std::tie<const unsigned long long, const unsigned long long> (1897 times, avg 17 ms)
 27886 ms: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (11156 times, avg 2 ms)

I mentioned this result to @chandlerc, and he suggested this direction.

AnalysisManager is already explicitly instantiated, and getResultImpl
doesn't need to be inlined. Move the definition to an Impl header, and
include that header in files that explicitly instantiate
AnalysisManager. There are only four (real) IR units:
- function
- module
- loop
- cgscc

Looking at a specific transform (ArgumentPromotion.cpp), here are three
compilations before & after this change:

BEFORE:
$ for i in $(seq 3) ; do ./ccit.bat ; done
peak memory: 258.15MB
real: 0m6.297s
peak memory: 257.54MB
real: 0m5.906s
peak memory: 257.47MB
real: 0m6.219s

AFTER:
$ for i in $(seq 3) ; do ./ccit.bat ; done
peak memory: 235.35MB
real: 0m5.454s
peak memory: 234.72MB
real: 0m5.235s
peak memory: 234.39MB
real: 0m5.469s

The 20MB of memory saved seems real, and the time improvement seems like
it is there.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D73817




More information about the All-commits mailing list