[cfe-dev] [RFC] Adding a different mode of "where clang spends time" reporting (timeline/flamegraph style)

Sam McCall via cfe-dev cfe-dev at lists.llvm.org
Tue Jan 15 03:27:27 PST 2019


(Sorry, hit send too early)

On Tue, Jan 15, 2019 at 12:20 PM Sam McCall <sammccall at google.com> wrote:
Hi Aras,

Nice work! Visualizing the timeline is a powerful tool, and I'd appreciate
having an easy way to annotate how spans of time are spent.
We also have a facility in clangd[1] to record traces and output them in
chrome tracing format, and I've also done some performance work with
similar tools internally.

Some random thoughts:
 - the distinction between tracing/profiling automatically (stack frames,
using xray or a profiler) and manually annotating ranges is an important
one - it may be possible to use XRay to combine the two.
 - the timeline view, flame graph, and graphviz representations are
independently useful, so being able to output something pprof can read is
nice
 - for multi-threaded programs, being able to represent events split across
or transferred between threads is important, but complicates the model
 - it's been useful to us to plug in other backends (e.g. internal
cross-server tracing tools)
 - the chrome tracing format is pretty awkward to work with: bulky, odd
semantics, doesn't survive truncation. It'd be nice to have something
better.

The clangd implementation probably isn't general purpose enough to use for
clang (particularly its approach to threading), but there might be some
useful ideas in there.

Cheers, Sam

[1]
https://reviews.llvm.org/source/clang-tools-extra/browse/clang-tools-extra/trunk/clangd/Trace.h

>
> On Sun, Jan 13, 2019 at 1:31 PM Aras Pranckevicius via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> Hello!
>>
>> TL;DR: I have made a Clang/LLVM code change that adds "-ftime-trace"
>> option, that produces Chrome Tracing format output. Would like comments on
>> whether that is a good idea or not, or perhaps someone else is already
>> doing this. My current (WIP) patch in github PR format is here
>> https://github.com/aras-p/llvm-project-20170507/pull/2 -- with images
>> and trace output files attached.
>>
>> Longer version:
>>
>> Current implementation of "-ftime-report" has several issues,
>> particularly when I'm just a "user" of the compiler:
>>
>> - it outputs a lot of information (almost half of it is duplicated since
>> clang 7.0),
>> - a lot of that information is things that only compiler developers would
>> know about,
>> - has quite large overhead, I've seen it make compile times take 1.5x
>> longer,
>> - has very little information about "frontend" part (preprocessing,
>> parsing, instantiation, C++ modules),
>> - the things it reports are only "summaries", i.e. "how much time it took
>> to do work X in total". e.g. it can tell that "inlining all functions took
>> X seconds", but in case there was just one super slow function to inline
>> among
>> thousands, it will not tell which one was the slow one.
>>
>> I have written a blog post about this (as well as lack of "good" time
>> reporting tools in Visual Studio and gcc) recently,
>> http://aras-p.info/blog/2019/01/12/Investigating-compile-times-and-Clang-ftime-report/
>>
>> At work (Unity game engine), with codebase of similar size to whole of
>> Clang/LLVM (2-3 million lines of code), we had really good experience in
>> adding timeline/flamegraph visualizations to various parts of our "build
>> system". This can tell us which .cpp files were slowest to compile in the
>> whole build, but I also wanted similar tooling for things "inside" single
>> .cpp file compilation.
>>
>> Thus this attempt at adding a new time trace profiling mode.
>>
>> I have current changes on github here,
>> https://github.com/aras-p/llvm-project-20170507/pull/2 -- can do a
>> proper "patch" thing via Phabricator if needed.
>>
>> My current code change does not quite match Clang/LLVM code standards and
>> probably needs some work, but the general approach seems to work. Already
>> found one case of Clang being very slow at parsing some weird recursive
>> macro thingamabob that we had; was causing about 5-8 seconds just to
>> include one header file. I probably would have never found it without this
>> type of visualization. Here it is very clear that among all the things,
>> parsing just that one header file takes almost all the time:
>> https://user-images.githubusercontent.com/348087/51038295-76efb780-15bb-11e9-926f-a6be1ffd03f1.png
>>
>>
>> Regards,
>>
>> --
>> Aras Pranckevičius
>> work: http://unity3d.com
>> home: http://aras-p.info
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190115/e3d36fd7/attachment.html>


More information about the cfe-dev mailing list