[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)
Chuanqi Xu via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Tue Oct 22 18:44:25 PDT 2024
ChuanqiXu9 wrote:
> @usx95 may be able to help with the reproducer.
>
> In the meantime, I'm trying to collect some information on the compile times. So far it looks like we have a ~10-15x compile time regression on some translation units. Without this patch `-ftime-report` shows:
>
> ```
> ===-------------------------------------------------------------------------===
> Clang front-end time report
> ===-------------------------------------------------------------------------===
> Total Execution Time: 39.1940 seconds (39.7238 wall clock)
>
> ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
> 28.2611 ( 77.5%) 1.8439 ( 67.3%) 30.1050 ( 76.8%) 30.5230 ( 76.8%) Clang front-end timer
> 8.1911 ( 22.5%) 0.8980 ( 32.7%) 9.0891 ( 23.2%) 9.2009 ( 23.2%) Reading modules
> 36.4522 (100.0%) 2.7419 (100.0%) 39.1940 (100.0%) 39.7238 (100.0%) Total
> ```
>
> With it:
>
> ```
> ===-------------------------------------------------------------------------===
> Clang front-end time report
> ===-------------------------------------------------------------------------===
> Total Execution Time: 466.7373 seconds (1251.6300 wall clock)
> ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
> 404.7200 ( 96.1%) 40.6383 ( 88.8%) 445.3583 ( 95.4%) 471.9647 ( 37.7%) Clang front-end timer
> 15.2098 ( 3.6%) 3.3586 ( 7.3%) 18.5684 ( 4.0%) 398.1242 ( 31.8%) Reading modules
> 420.9899 (100.0%) 45.7474 (100.0%) 466.7373 (100.0%) 1251.6300 (100.0%) Total
> ```
>
> `perf record -g` / `perf report` give the following picture:
>
> ```
> Children Self Command Shared Object Symbol
> + 94.85% 0.00% clang clang [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformCallExpr(clang::CallExpr*) [clone .__uniq.16014532493918845222783194145290083557]
> + 93.47% 0.00% clang clang [.] clang::Sema::InstantiateFunctionDefinition(clang::SourceLocation, clang::FunctionDecl*, bool, bool, bool)
> + 93.37% 83.51% clang clang [.] clang::ASTReader::LoadExternalSpecializations(clang::Decl const*, bool)
> + 93.19% 0.00% clang clang [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformCompoundStmt(clang::CompoundStmt*, bool) [clone .__uniq.16014532493918845222783194
> + 93.08% 0.00% clang clang [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformUnresolvedLookupExpr(clang::UnresolvedLookupExpr*, bool) [clone .__uniq.1601453249
> + 92.98% 0.00% clang clang [.] clang::Sema::BuildTemplateIdExpr(clang::CXXScopeSpec const&, clang::SourceLocation, clang::LookupResult&, bool, clang::TemplateArgumentListInfo const*)
> + 92.44% 0.00% clang clang [.] clang::Sema::CheckVarTemplateId(clang::VarTemplateDecl*, clang::SourceLocation, clang::SourceLocation, clang::TemplateArgumentListInfo const&)
> + 92.08% 0.00% clang clang [.] clang::Sema::InstantiateVariableInitializer(clang::VarDecl*, clang::VarDecl*, clang::MultiLevelTemplateArgumentList const&)
> + 91.87% 0.00% clang clang [.] clang::VarTemplateDecl::getPartialSpecializations(llvm::SmallVectorImpl<clang::VarTemplatePartialSpecializationDecl*>&) const
> + 91.18% 0.00% clang clang [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformBinaryOperator(clang::BinaryOperator*) [clone .__uniq.1601453249391884522278319414
> + 91.07% 0.00% clang clang [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformExprs(clang::Expr* const*, unsigned int, bool, llvm::SmallVectorImpl<clang::Expr*>
> + 90.70% 0.01% clang clang [.] clang::Sema::InstantiateVariableDefinition(clang::SourceLocation, clang::VarDecl*, bool, bool, bool)
> + 90.41% 0.01% clang clang [.] clang::Sema::BuildDeclarationNameExpr(clang::CXXScopeSpec const&, clang::DeclarationNameInfo const&, clang::NamedDecl*, clang::NamedDecl*, clang::TemplateArgu
> + 90.29% 0.00% clang clang [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformInitListExpr(clang::InitListExpr*) [clone .__uniq.16014532493918845222783194145290
> + 89.92% 0.00% clang clang [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformParenExpr(clang::ParenExpr*) [clone .__uniq.16014532493918845222783194145290083557
> + 89.23% 0.00% clang clang [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformConditionalOperator(clang::ConditionalOperator*) [clone .__uniq.160145324939188452
> + 84.49% 0.02% clang clang [.] clang::Sema::RequireCompleteTypeImpl(clang::SourceLocation, clang::QualType, clang::Sema::CompleteTypeKind, clang::Sema::TypeDiagnoser*)
> + 84.47% 0.00% clang clang [.] clang::Sema::InstantiateClassTemplateSpecialization(clang::SourceLocation, clang::ClassTemplateSpecializationDecl*, clang::TemplateSpecializationKind, bool)
> + 84.07% 0.01% clang clang [.] clang::Sema::InstantiateClass(clang::SourceLocation, clang::CXXRecordDecl*, clang::CXXRecordDecl*, clang::MultiLevelTemplateArgumentList const&, clang::Templa
> + 82.84% 0.02% clang clang [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformType(clang::TypeLocBuilder&, clang::TypeLoc) [clone .__uniq.1601453249391884522278
> + 82.23% 0.02% clang clang [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformTemplateSpecializationType(clang::TypeLocBuilder&, clang::TemplateSpecializationTy
> + 81.99% 0.01% clang clang [.] (anonymous namespace)::TemplateInstantiator::TransformTemplateArgument(clang::TemplateArgumentLoc const&, clang::TemplateArgumentLoc&, bool) [clone .__uniq.16
> + 81.54% 0.00% clang clang [.] clang::Sema::RequireCompleteDeclContext(clang::CXXScopeSpec&, clang::DeclContext*)
> + 80.18% 0.01% clang clang [.] clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformType(clang::TypeSourceInfo*) [clone .__uniq.16014532493918845222783194145290083557
> + 79.88% 0.12% clang clang [.] clang::Sema::CheckTemplateIdType(clang::TemplateName, clang::SourceLocation, clang::TemplateArgumentListInfo&)
> ```
>
> I can try to build clang with better debug information and get a higher fidelity profile, but hopefully this already shows the direction to look at.
Thanks. It looks like `ASTReader::LoadExternalSpecializations(const Decl *D, bool OnlyPartial)` is the hot spot, which I had not anticipated. The problem may be `findAll()` itself, since it always loads all the specializations, or it may be that we call `findAll()` too many times. I'll take a look. A profile with more information would definitely be helpful.
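
To make the two hypotheses concrete, here is a toy model in plain C++. It is not Clang code: `ToySpecializationTable`, `NaiveLoader`, `GuardedLoader`, and `totalWork` are hypothetical names invented for illustration, and the real lookup table in `ASTReader` may behave differently. The sketch only shows why a `findAll()`-style query that materializes every entry becomes expensive when it is reached repeatedly for the same template, and how a per-template "already loaded" guard would bound the work under that assumption.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a serialized table of specializations.
// Nothing here is Clang API.
struct ToySpecializationTable {
  std::vector<int> Entries;          // each int stands for one serialized specialization
  std::size_t DeserializedCount = 0; // total entries materialized so far

  // Models a findAll()-style query: every call materializes all entries.
  std::vector<int> findAll() {
    DeserializedCount += Entries.size();
    return Entries;
  }
};

// Naive loader: every request re-runs findAll().
struct NaiveLoader {
  ToySpecializationTable &Table;
  explicit NaiveLoader(ToySpecializationTable &T) : Table(T) {}
  void loadExternalSpecializations(int /*TemplateID*/) { (void)Table.findAll(); }
};

// Guarded loader: remembers which templates were already fully loaded
// and skips the query on later requests.
struct GuardedLoader {
  ToySpecializationTable &Table;
  std::unordered_set<int> FullyLoaded;
  explicit GuardedLoader(ToySpecializationTable &T) : Table(T) {}
  void loadExternalSpecializations(int TemplateID) {
    if (!FullyLoaded.insert(TemplateID).second)
      return; // already loaded once, nothing new to do in this toy model
    (void)Table.findAll();
  }
};

// Simulates `Requests` loads of the same template and returns the total
// number of entries that were materialized.
template <typename Loader>
std::size_t totalWork(std::size_t NumEntries, std::size_t Requests) {
  ToySpecializationTable Table;
  Table.Entries.assign(NumEntries, 0);
  Loader L(Table);
  for (std::size_t I = 0; I < Requests; ++I)
    L.loadExternalSpecializations(/*TemplateID=*/42);
  return Table.DeserializedCount;
}
```

In this model, `totalWork<NaiveLoader>(N, R)` grows as `N * R` while `totalWork<GuardedLoader>(N, R)` stays at `N`. Whether a guard like this is valid in the real reader depends on whether new specializations can appear between calls, which this sketch does not model.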
https://github.com/llvm/llvm-project/pull/83237