[cfe-dev] [RFC] Adding a different mode of "where clang spends time" reporting (timeline/flamegraph style)

Bruno Ricci via cfe-dev cfe-dev at lists.llvm.org
Wed Jan 23 05:49:59 PST 2019


Just to say that I am also interested in something like this, although more from
the point of view of someone working on the compiler itself. I think, however, that
it would make sense to find a way to have one system for generating time reports
that would be useful both to people working on clang and to people using clang.

Another thing I would like to have is a way to trace memory allocations, both in the
BumpAllocators and in malloc, showing both when they happen and where they are
logically triggered from.

Bruno

On 23/01/2019 01:00, Reid Kleckner via cfe-dev wrote:
> This is pretty cool!
> 
> I think the question is, "what next"? I don't really want to say "merge it right away!", because then we will have two systems for generating time reports, but I also don't want to send you off into the hills to refactor the existing Timer to support both output formats, and then decide what to do with all the fine-grained timers geared towards compiler developers. Would you be interested in working in that direction, or would you rather hand the code off to someone else to try to integrate it more tightly?
> 
> I feel like right now C++ users don't have a lot of visibility into build performance, so it ends up falling to specialist toolchain people to go and debug these problems. If we made it easy to see into the performance of compilation, it would democratize build time optimization, letting generalists do these kinds of codebase cleanups. At least, I hope it will. :)
> 
> On Sun, Jan 13, 2019 at 4:31 AM Aras Pranckevicius via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> 
>     Hello!
> 
>     TL;DR: I have made a Clang/LLVM code change that adds a "-ftime-trace" option that produces Chrome Tracing format output. I would like comments on whether that is a good idea or not, or whether someone else is already doing this. My current (WIP) patch in github PR format is here https://github.com/aras-p/llvm-project-20170507/pull/2 -- with images and trace output files attached.
> 
>     Longer version:
> 
>     The current implementation of "-ftime-report" has several issues, particularly when I'm just a "user" of the compiler:
> 
>     - it outputs a lot of information (almost half of it is duplicated since clang 7.0),
>     - a lot of that information is things that only compiler developers would know about,
>     - has quite a large overhead; I've seen it make compile times take 1.5x longer,
>     - has very little information about "frontend" part (preprocessing, parsing, instantiation, C++ modules),
>     - the things it reports are only "summaries", i.e. "how much time it took to do work X in total". E.g. it can tell that "inlining all functions took X seconds", but in case there was just one super slow function to inline among thousands, it will not tell which one was the slow one.
> 
>     I have written a blog post about this (as well as lack of "good" time reporting tools in Visual Studio and gcc) recently, http://aras-p.info/blog/2019/01/12/Investigating-compile-times-and-Clang-ftime-report/
> 
>     At work (Unity game engine), with a codebase of similar size to the whole of Clang/LLVM (2-3 million lines of code), we had a really good experience adding timeline/flamegraph visualizations to various parts of our "build system". This can tell us which .cpp files were slowest to compile in the whole build, but I also wanted similar tooling for things "inside" a single .cpp file compilation.
> 
>     Thus this attempt at adding a new time trace profiling mode.
> 
>     I have current changes on github here, https://github.com/aras-p/llvm-project-20170507/pull/2 -- can do a proper "patch" thing via Phabricator if needed.
> 
>     My current code change does not quite match Clang/LLVM code standards and probably needs some work, but the general approach seems to work. I already found one case of Clang being very slow at parsing some weird recursive macro thingamabob that we had; it was costing about 5-8 seconds just to include one header file. I probably would never have found it without this type of visualization. Here it is very clear that among all the things, parsing just that one header file takes almost all the time: https://user-images.githubusercontent.com/348087/51038295-76efb780-15bb-11e9-926f-a6be1ffd03f1.png
> 
> 
>     Regards,
> 
>     -- 
>     Aras Pranckevičius
>     work: http://unity3d.com
>     home: http://aras-p.info
>     _______________________________________________
>     cfe-dev mailing list
>     cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> 
> 
> 
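For readers unfamiliar with the Chrome Tracing format mentioned in the quoted proposal, here is a minimal sketch of the idea in Python. It is not the code from the patch. The class name, the "Source" event name, and the header names are hypothetical, and the clock is faked so the result is deterministic. Each timed section is recorded as a separate "complete" event ("ph": "X") with a start timestamp and a duration in microseconds, which is what lets a timeline viewer show the one slow item that a summed report would hide.

```python
import json
import time
from contextlib import contextmanager


def _micros():
    # Monotonic clock in whole microseconds (avoids float rounding).
    return time.perf_counter_ns() // 1000


class TraceRecorder:
    """Hypothetical helper: records timed sections as Chrome trace events."""

    def __init__(self, clock=_micros):
        self._clock = clock
        self._start = clock()
        self.events = []

    @contextmanager
    def section(self, name, detail=""):
        begin = self._clock()
        try:
            yield
        finally:
            end = self._clock()
            self.events.append({
                "name": name,
                "ph": "X",                   # "complete" event: start + duration
                "ts": begin - self._start,   # microseconds since recorder start
                "dur": end - begin,          # microseconds
                "pid": 1,
                "tid": 0,
                "args": {"detail": detail},  # e.g. which file or function
            })

    def to_json(self):
        # JSON object form of the trace, with events under "traceEvents".
        return json.dumps({"traceEvents": self.events}, indent=2)

    def totals(self):
        # Summary view: total duration per event name, like a summed report.
        out = {}
        for e in self.events:
            out[e["name"]] = out.get(e["name"], 0) + e["dur"]
        return out


def demo():
    # Fake clock readings in microseconds: two fast headers, one slow one.
    ticks = iter([0, 0, 1000, 1000, 2000, 2000, 5_002_000])
    rec = TraceRecorder(clock=lambda: next(ticks))
    for header in ["a.h", "b.h", "slow_macros.h"]:
        with rec.section("Source", header):
            pass  # real work (e.g. parsing the header) would go here
    return rec
```

Calling `demo().totals()` gives only a single total for "Source", while the per-event list still identifies `slow_macros.h` as the outlier. The string from `to_json()` can be saved to a file and loaded in a trace viewer such as chrome://tracing.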


