[PATCH] D60609: Use native llvm JSON library for time profiler output

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 15 05:36:08 PDT 2019


lebedev.ri resigned from this revision.
lebedev.ri added a comment.
This revision is now accepted and ready to land.

In D60609#1466384 <https://reviews.llvm.org/D60609#1466384>, @anton-afanasyev wrote:

> In D60609#1465351 <https://reviews.llvm.org/D60609#1465351>, @lebedev.ri wrote:
>
> > In D60609#1465327 <https://reviews.llvm.org/D60609#1465327>, @takuto.ikuta wrote:
> >
> > > I think it is true that the json library is slower than just naive json output.
> > >  But this code runs only once per compile and should emit relatively few json objects,
> > >  so the visible performance effect will be low. Taking stats sounds good, and I'd be surprised if this patch changed clang performance much.
> >
> >
> > Yes, that is my hope too, but given history I'd like to be confident :)
> >  This will really depend on the number of entries in the output.
> >  I'd like to see perf measurements on some large compiler invocation, with *many* of these entries in the output json report.
>
>
> I've measured time performance for the large source file `test-suite/SingleSource/Benchmarks/Misc-C++-EH/spirit.cpp`.


Thank you for doing the measurements.

> Five tests: without time tracing, with time tracing for old/new json output, with small granularity for old/new json output.
>
> Small time granularity leads to many entries in the json output (~32K vs ~2K for default granularity).

It would be interesting to see memory consumption changes (heaptrack <https://github.com/KDE/heaptrack>),
but with this small maximal number of entries, pretty much anything //should// not be too noticeable.

> As one can see, the json library is actually slower, but it has no notable effect by default, and I believe it's an acceptable price even for a large number of entries.
> 
>   perf stat -r 5 ./build_rel_base/bin/clang++ -w -S spirit.cpp
>   ...
>   3362.032795      task-clock (msec) 
> 
> 
>   perf stat -r 5 ./build_rel_base/bin/clang++ -w -S -ftime-trace spirit.cpp
>   ...
>   3577.592945      task-clock (msec)

+215 ms or +~6% (as compared to without -ftime-trace)

>   perf stat -r 5 ./build_rel_exp/bin/clang++ -w -S -ftime-trace spirit.cpp
>   ...
>   3582.747398      task-clock (msec)

+220 ms or +~7% (as compared to without -ftime-trace, or +5ms/0.1% as compared to without this patch)

>   perf stat -r 5 ./build_rel_base/bin/clang++ -w -S -ftime-trace -mllvm -time-trace-granularity=0 spirit.cpp
>   ...
>   3634.623793      task-clock (msec)

+273 ms or +~8% (as compared to without -ftime-trace)

>   perf stat -r 5 ./build_rel_exp/bin/clang++ -w -S -ftime-trace -mllvm -time-trace-granularity=0 spirit.cpp
>   ...
>   3995.315749      task-clock (msec)

+633 ms or +19% (as compared to without -ftime-trace, or +360ms/+10% as compared to without this patch)

Okay, so the price increase in default config is reasonably small.
It is **noticeably** larger with bigger number of events though.
I don't like how non-thin of a wrapper `"llvm/Support/JSON.h"` is. :/

TLDR: Not horrible, but above what I would consider a reasonable price for abstraction.



================
Comment at: llvm/lib/Support/TimeProfiler.cpp:83-86
+    json::Array Events;
 
     // Emit all events for the main flame graph.
     for (const auto &E : Entries) {
----------------
There is no `reserve()` function in `json::Array`, thus you end up doing the memory dance for *every* entry. :/
(alloc larger, move, delete previous)
I suspect this accounts for **majority** of the extra cost, although not all of it.


================
Comment at: llvm/lib/Support/TimeProfiler.cpp:90
+
+      Events.push_back(json::Object{
+          {"pid", 1},
----------------
`emplace_back()` doesn't work?
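For context, a stdlib illustration of the distinction (not the LLVM API; `makeEvents` and the field names are made up for the example): `push_back(T{...})` materializes a temporary and then moves it into the container, while `emplace_back(args...)` forwards the arguments and constructs the element in place. One caveat is that `emplace_back` cannot deduce a braced-init-list, so the `push_back(json::Object{...})` spelling may simply be the one that compiles as written.

```cpp
#include <string>
#include <utility>
#include <vector>

// push_back({"pid", 1}) builds a temporary pair, then moves it in;
// emplace_back("tid", 0) constructs the pair in place from its arguments.
// emplace_back({"pid", 1}) would not compile: a braced-init-list cannot
// be deduced by the forwarding parameters of emplace_back.
static std::vector<std::pair<std::string, int>> makeEvents() {
  std::vector<std::pair<std::string, int>> Events;
  Events.push_back({"pid", 1});  // temporary pair, then move
  Events.emplace_back("tid", 0); // constructed in place
  return Events;
}
```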


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60609/new/

https://reviews.llvm.org/D60609




