[PATCH] D60609: Use native llvm JSON library for time profiler output

Anton Afanasyev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 16 10:46:48 PDT 2019


anton-afanasyev marked an inline comment as done.
anton-afanasyev added inline comments.


================
Comment at: llvm/lib/Support/TimeProfiler.cpp:83-86
+    json::Array Events;
 
     // Emit all events for the main flame graph.
     for (const auto &E : Entries) {
----------------
anton-afanasyev wrote:
> lebedev.ri wrote:
> > There is no `reserve()` function in `json::Array`, so you end up doing the memory dance for *every* entry. :/
> > (allocate larger, move, delete the previous buffer)
> > I suspect this accounts for the **majority** of the extra cost, although not all of it.
> Good point, this is worth changing in the json lib.
I've measured the perf effect of adding `Array.reserve(33000)` (changed `JSON.h` a bit): it gives 3845 msec (vs 3995 msec) for `-time-trace-granularity=0`. The effect is noticeable, but an actual implementation needs a smart heuristic to determine the number of entries to reserve. For the default option (time-trace-granularity = 500 msec) this doesn't matter.
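For reference, a minimal sketch of the kind of `JSON.h` tweak used for that measurement (illustrative fragment, not the exact diff): `json::Array` keeps its elements in a `std::vector<Value>`, so `reserve()` can simply forward to the underlying vector.

```cpp
// Fragment of llvm/include/llvm/Support/JSON.h (sketch only).
class Array {
  std::vector<Value> V;

public:
  // Pre-allocate storage for Size elements so appending events one by one
  // does not trigger repeated reallocation (the "memory dance" above).
  void reserve(size_t Size) { V.reserve(Size); }

  // ... existing constructors, push_back/emplace_back, iterators, etc. ...
};
```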


================
Comment at: llvm/lib/Support/TimeProfiler.cpp:83-86
+    json::Array Events;
 
     // Emit all events for the main flame graph.
     for (const auto &E : Entries) {
----------------
lebedev.ri wrote:
> anton-afanasyev wrote:
> > anton-afanasyev wrote:
> > > lebedev.ri wrote:
> > > > There is no `reserve()` function in `json::Array`, so you end up doing the memory dance for *every* entry. :/
> > > > (allocate larger, move, delete the previous buffer)
> > > > I suspect this accounts for the **majority** of the extra cost, although not all of it.
> > > Good point, this is worth changing in the json lib.
> > I've measured the perf effect of adding `Array.reserve(33000)` (changed `JSON.h` a bit): it gives 3845 msec (vs 3995 msec) for `-time-trace-granularity=0`. The effect is noticeable, but an actual implementation needs a smart heuristic to determine the number of entries to reserve. For the default option (time-trace-granularity = 500 msec) this doesn't matter.
> If you don't intend to look into that (should be rather trivial),
> can you please at least raise a bug about this, with those perf numbers?
What do you mean by a bug here? Do you mean I should add `Array.reserve(N)` with `N` chosen heuristically from runs at different `time-trace-granularity` settings? Or would just `Array.reserve(4000)` be enough for the default granularity?
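To illustrate the alternative to a hard-coded `N`: once `json::Array` gains a `reserve()`, the writer in `TimeProfiler.cpp` already knows how many entries it has collected at write time, so it could size the array from that instead of guessing. A rough sketch (names follow the snippet quoted above; not part of this patch):

```cpp
json::Array Events;
// Size the array from the data already collected rather than a fixed guess;
// assumes the hypothetical json::Array::reserve() discussed above.
Events.reserve(Entries.size());

// Emit all events for the main flame graph.
for (const auto &E : Entries) {
  // ... build the per-event json::Object and push it into Events ...
}
```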


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60609/new/

https://reviews.llvm.org/D60609
