[cfe-dev] [RFC] Adding a different mode of "where clang spends time" reporting (timeline/flamegraph style)
Aras Pranckevicius via cfe-dev
cfe-dev at lists.llvm.org
Sun Jan 13 04:31:31 PST 2019
TL;DR: I have made a Clang/LLVM code change that adds a "-ftime-trace"
option, which produces Chrome Tracing format output. I would like comments on
whether this is a good idea or not, or whether someone else is already
doing this. My current (WIP) patch, in GitHub PR format, is here:
https://github.com/aras-p/llvm-project-20170507/pull/2 -- with images and
trace output files attached.
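For readers who have not seen Chrome Tracing output before, here is a minimal sketch of the general idea: a scoped timer records a named interval, and the intervals are written out as "complete" events ("ph":"X", with "ts" and "dur" in microseconds). This is not the code from the patch; `TraceRecorder`, `ScopedTrace` and the fixed pid/tid values are hypothetical names and choices made up for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not taken from the actual patch.
struct TraceEvent {
  std::string Name;
  std::int64_t StartUs;    // start time, microseconds since recorder creation
  std::int64_t DurationUs; // duration in microseconds
};

class TraceRecorder {
public:
  using Clock = std::chrono::steady_clock;

  TraceRecorder() : Origin(Clock::now()) {}

  // Microseconds elapsed since this recorder was created.
  std::int64_t nowUs() const {
    return std::chrono::duration_cast<std::chrono::microseconds>(
               Clock::now() - Origin)
        .count();
  }

  void add(std::string Name, std::int64_t StartUs, std::int64_t DurationUs) {
    Events.push_back({std::move(Name), StartUs, DurationUs});
  }

  std::size_t size() const { return Events.size(); }

  // Serializes all events as Chrome Tracing "complete" events.
  // Assumption: names contain no quotes or backslashes, so no JSON
  // escaping is done here.
  std::string toJson() const {
    std::ostringstream OS;
    OS << "{\"traceEvents\":[";
    for (std::size_t I = 0; I < Events.size(); ++I) {
      const TraceEvent &E = Events[I];
      if (I != 0)
        OS << ",";
      OS << "{\"name\":\"" << E.Name << "\",\"ph\":\"X\",\"pid\":1,\"tid\":0"
         << ",\"ts\":" << E.StartUs << ",\"dur\":" << E.DurationUs << "}";
    }
    OS << "]}";
    return OS.str();
  }

private:
  Clock::time_point Origin;
  std::vector<TraceEvent> Events;
};

// RAII helper: records one event covering the lifetime of the object.
class ScopedTrace {
public:
  ScopedTrace(TraceRecorder &Rec, std::string Label)
      : Recorder(Rec), Name(std::move(Label)), StartUs(Rec.nowUs()) {}
  ~ScopedTrace() {
    Recorder.add(std::move(Name), StartUs, Recorder.nowUs() - StartUs);
  }
  ScopedTrace(const ScopedTrace &) = delete;
  ScopedTrace &operator=(const ScopedTrace &) = delete;

private:
  TraceRecorder &Recorder;
  std::string Name;
  std::int64_t StartUs;
};
```

Wrapping a piece of work in a block with a `ScopedTrace` object, then saving the `toJson()` result to a file, should give something that can be loaded in chrome://tracing and shown as a timeline.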
The current implementation of "-ftime-report" has several issues, particularly
for someone who is just a "user" of the compiler:
- it outputs a lot of information (almost half of it is duplicated since
- a lot of that information is things that only compiler developers would
- has quite large overhead, I've seen it make compile times take 1.5x
- has very little information about "frontend" part (preprocessing,
parsing, instantiation, C++ modules),
- the things it reports are only "summaries", i.e. "how much time it took
to do work X in total". E.g. it can tell that "inlining all functions took
X seconds", but if there was just one super slow function to inline among
thousands, it will not tell which one was the slow one.
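To make the last point concrete, here is a small sketch of the difference between a summary and per-event data. The `InlineSample` record and both helper functions are hypothetical and only for illustration; they are not part of Clang or of the patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical per-function timing record, made up for illustration.
struct InlineSample {
  std::string Function;
  std::int64_t Micros;
};

// What a summary-style report keeps: one total for the whole pass.
std::int64_t totalMicros(const std::vector<InlineSample> &Samples) {
  return std::accumulate(
      Samples.begin(), Samples.end(), std::int64_t{0},
      [](std::int64_t Sum, const InlineSample &S) { return Sum + S.Micros; });
}

// What per-event data additionally allows: naming the slowest item.
// Returns nullptr when there are no samples.
const InlineSample *slowest(const std::vector<InlineSample> &Samples) {
  auto It = std::max_element(
      Samples.begin(), Samples.end(),
      [](const InlineSample &A, const InlineSample &B) {
        return A.Micros < B.Micros;
      });
  return It == Samples.end() ? nullptr : &*It;
}
```

With only the total, thousands of fast functions plus one very slow one look the same as uniformly slow functions; with the individual records kept, the outlier can be named directly.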
I recently wrote a blog post about this (as well as the lack of "good" time
reporting tools in Visual Studio and gcc).
At work (Unity game engine), with a codebase of similar size to the whole of
Clang/LLVM (2-3 million lines of code), we had a really good experience
adding timeline/flamegraph visualizations to various parts of our "build
system". This can tell us which .cpp files were slowest to compile in the
whole build, but I also wanted similar tooling for things "inside" a single
.cpp file compilation.
Thus this attempt at adding a new time trace profiling mode.
My current changes are on GitHub here:
https://github.com/aras-p/llvm-project-20170507/pull/2 -- I can do a proper
"patch" thing via Phabricator if needed.
My current code change does not quite match Clang/LLVM code standards and
probably needs some work, but the general approach seems to work. I already
found one case of Clang being very slow at parsing some weird recursive
macro thingamabob that we had; it was taking about 5-8 seconds just to
include one header file. I probably would have never found it without this
type of visualization. Here it is very clear that among all the things,
parsing just that one header file takes almost all the time: