[PATCH] D43578: -ftime-report switch support in Clang
Richard Smith - zygoloid via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Sun Apr 15 17:19:51 PDT 2018
rsmith added a comment.
In https://reviews.llvm.org/D43578#1068127, @mzolotukhin wrote:
> > Who is the audience for this information?
> > What information do they want from a time report?
> > How do we present that information in a way that's not misleading (given Clang's architecture)?
> I would find the timers extremely useful. Even if they overlap, they still would 1) be a good indicator of a newly introduced problem and 2) give a rough idea where frontend time is spent. I agree that it wouldn't be super-accurate, but the numbers we usually operate are quite high (e.g. <some part> started to take 1.5x time). When we decide to investigate it deeper, we can use more accurate tools, such as profilers.
What kinds of <some part> would be useful to you? (How do you want the runtime of Clang broken down?) Are vertical slices (through all Clang's various layers and potentially parts of LLVM) -- as this patch will produce -- useful, or would you really want horizontal slices (as in, what part of Clang is actually spending the time)? Or just anything that's basically expected to be consistent from run to run, so you can identify that //something// has slowed down, even if you don't have enough information to really know what?
> All that said, if we can make timers more accurate/not-overlapping, that surely would be helpful as well.
I think we need to fix the overlap issue as a prerequisite to adding timers with large amounts of overlap, especially self-overlap. Otherwise the numbers will likely do more harm than good, as they will significantly misattribute runtime. Fortunately, I think that should only require some relatively small changes to the LLVM timer infrastructure.
>> Can we deliver useful value compared to a modern, dedicated profiling tool?
> Having timers, especially producing results in the same format, as existing LLVM timers, would be much more convenient than using profilers. With timers I can simply add a compile-time flag to my test-suite run and get all the numbers in the end of my usual run. With profilers it's a bit more complicated.
Well, that depends on the profiler. With some profilers, you can just attach a profiler to your entire compilation and then ask it what the hottest functions were. But sure, if you have existing infrastructure built around `-ftime-report`, I can see that it makes sense to reuse that infrastructure.
> Speaking of timers overlapping and mutual calling: LLVM timers also have this problem, and e.g. if there is problem is in some analysis (say, ScalarEvolution), we see indications in several other passes using this analysis (say, IndVarSimplify and LoopStrengthReduction). While it does not immediately show the problematic spot, it gives you pretty strong hints to where to look at first. So, having even not-perfect timers is still useful.
While LLVM may have some overlap between regions, the vertical timing slices are still pretty well aligned with the horizontal functionality slices. I expect the problems in Clang to be a //lot// worse. Suppose you enter N levels of parsing templates, and then trigger a whole bunch of template instantiation, AST deserialization, and code generation. Let's say that takes 1s in total. With this patch, I think we'd end up attributing Ns of compile time to "parsing templates", which is clearly very far from the truth, but will likely be listed as (by far) the slowest thing in the compilation. This does not even rise to the level of "not-perfect", it's going to be actively misleading in a lot of cases, and won't even necessarily point anywhere near the problematic spot.
I think with this patch and no fix to the overlap problem, we will find ourselves frequently fielding bugs where people say "X is slow" when it actually isn't. Plus I think we have the opportunity to deliver something that's hugely useful and not much harder to achieve (providing timing information that relates back to the source code), and I'd prefer we spend our complexity budget for this feature there.
> Also, I apologize for LGTMing the patch earlier while it was not properly reviewed - I didn't notice it didn't have cfe-commits in subscribers, and it had been waiting for a review for quite some time (so I assumed that all interested parties had their chance to look at it).
No problem, these things happen :) And thank you for your feedback, it's very useful.
More information about the cfe-commits