[cfe-dev] clang memory usage with C++ template metaprogramming

Wed Jun 9 11:52:44 PDT 2010

On 09/06/10 00:54, Douglas Gregor wrote:
> On Jun 8, 2010, at 4:03 PM, John Bytheway wrote:
>> Fair enough.  I was curious, so I ran valgrind/massif to get an
>> idea. In short:
<snip>
>> 12.85% (201,326,592B) clang::SourceManager::createInstantiationLoc 
>> 06.82% (106,841,236B) clang::TokenLexer::ExpandFunctionArguments
> 
> There must be some preprocessor metaprogramming going on in this
> example, too? That's pretty big for the preprocessor.

Yes, there is.  Give me variadic templates and constexpr functions and I
can dispose of most of it :).
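
As a rough illustration of that remark (not code from the example that was profiled), here is a minimal sketch of the kind of repetition that preprocessor metaprogramming is often used to generate, written with variadic templates and constexpr functions instead. The names type_list, sum and total_size are hypothetical and chosen only for this sketch; it assumes a compiler with C++11 support.

```cpp
#include <cstddef>

// Hypothetical type list; with variadic templates there is no need to
// generate type_list1, type_list2, ... through macro expansion.
template <typename... Ts>
struct type_list {
    static constexpr std::size_t size = sizeof...(Ts);
};

// Base case of the recursion: the sum of no values is zero.
constexpr std::size_t sum() { return 0; }

// C++11-style constexpr recursion over a parameter pack.
template <typename... Rest>
constexpr std::size_t sum(std::size_t first, Rest... rest) {
    return first + sum(rest...);
}

// Total of sizeof over every type in the list, computed at compile time.
template <typename... Ts>
constexpr std::size_t total_size(type_list<Ts...>) {
    return sum(sizeof(Ts)...);
}

// Evaluated entirely by the compiler; no macros are expanded.
static_assert(total_size(type_list<char, char>{}) == 2,
              "two chars occupy two bytes");
```

Nothing here goes through the preprocessor, so a version written this way would not be expected to add to the macro-expansion allocations listed above, although the template instantiations still have their own cost.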

>> 12.83% (201,068,544B) clang::ASTContext::CreateTypeSourceInfo
> 
> Yes, this is the type-source information I mentioned. If we make
> template instantiation "perfect" with respect to type-source
> information, so that any dependent type instantiates down to
> something that is structurally identical to the form it had when it was
> written in the source, then we could avoid allocating memory for
> type-source information in each type instantiation. We're not too far
> from this goal, but it has to be *perfect* for us to use the
> optimization.

A laudable goal.

>> 04.94% (77,463,552B) clang::CXXConstructorDecl::Create
>> 02.05% (32,157,696B) clang::CXXMethodDecl::Create
>> 01.82% (28,585,984B) clang::CXXDestructorDecl::Create
> 
> A number of these could be eliminated if we were to lazily create the
> implicitly-declared default constructor, copy constructor,
> copy-assignment operator, and destructor.

That sounds like the easiest of these; if it is, then it's a shame these
are not a larger proportion of the problem.
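
To make the lazy-creation idea concrete, here is a minimal sketch under stated assumptions. MemberDecl and RecordSketch are hypothetical stand-ins invented for this illustration; they are not Clang's classes and do not reflect how Clang stores declarations. The point is only that a declaration nobody asks for is never allocated.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for a declaration node; not a Clang class.
struct MemberDecl {
    std::string name;
};

// Hypothetical stand-in for a class record that owns its implicit members.
class RecordSketch {
public:
    // Counts allocations so the saving is observable in a test.
    static int allocations;

    // An eager design would allocate this in the record's constructor.
    // Here the node is created on first request and reused afterwards.
    const MemberDecl& defaultConstructor() {
        if (!default_ctor_) {
            default_ctor_.reset(new MemberDecl{"default constructor"});
            ++allocations;
        }
        return *default_ctor_;
    }

    // True only once something has actually asked for the constructor.
    bool hasDeclaredDefaultConstructor() const {
        return default_ctor_ != nullptr;
    }

private:
    std::unique_ptr<MemberDecl> default_ctor_;
};

int RecordSketch::allocations = 0;
```

In template-heavy code many instantiated classes never have their special members looked up at all, which is where a scheme along these lines would be expected to save memory.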

>> I wonder idly: How plausible would it be to allow execution in a
>> mode where no source information was maintained, and thus reduce
>> memory usage (at the expense of useful errors/warnings)?  Such a
>> mode might be useful at times.  I'm guessing it would be
>> prohibitively difficult.
> 
> We discussed this back when we improved type-source location
> information, but I am very much against having such a mode: the AST
> should always be the same, for all clients, or the size of the
> testing matrix explodes and we get far worse coverage. We should
> spend time optimizing the system as a unified whole rather than
> trying to separate out the less-efficient bits that provide needed
> functionality.

Yeah, that feels wise.

Thanks for the insight,

John Bytheway