[llvm] r354764 - [XRay][tools] Revert "Use Support/JSON.h in llvm-xray convert"
Hans Wennborg via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 26 02:01:37 PST 2019
Merged to release_80 in r354856.
On Mon, Feb 25, 2019 at 8:38 AM Roman Lebedev via llvm-commits
<llvm-commits at lists.llvm.org> wrote:
>
> Author: lebedevri
> Date: Sun Feb 24 23:39:07 2019
> New Revision: 354764
>
> URL: http://llvm.org/viewvc/llvm-project?rev=354764&view=rev
> Log:
> [XRay][tools] Revert "Use Support/JSON.h in llvm-xray convert"
>
> Summary:
> This reverts D50129 / rL338834: [XRay][tools] Use Support/JSON.h in llvm-xray convert
>
> Abstractions are great.
> Readable code is great.
> JSON support library is a *good* idea.
>
> However unfortunately, there is an internal detail that one needs
> to be aware of in `llvm::json::Object` - it uses `llvm::DenseMap`.
> So for **every** `llvm::json::Object`, even if you only store a single `int`
> entry there, you pay the whole price of `llvm::DenseMap`.
>
> Unfortunately, it matters for `llvm-xray`.
>
> I was trying to analyse the `llvm-exegesis` analysis mode performance,
> and for that i wanted to view the LLVM X-Ray log visualization in Chrome
> trace viewer. And the `llvm-xray convert` is sluggish, and sometimes
> even ended up being killed by OOM.
>
> `xray-log.llvm-exegesis.lwZ0sT` was acquired from `llvm-exegesis`
> (compiled with ` -fxray-instruction-threshold=128`)
> analysis mode over `-benchmarks-file` with 10099 points (one full
> latency measurement set), with normal runtime of 0.387s.
>
> Timings:
> Old: (copied from D58580)
> ```
> $ perf stat -r 5 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT
>
> Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (5 runs):
>
> 21346.24 msec task-clock # 1.000 CPUs utilized ( +- 0.28% )
> 314 context-switches # 14.701 M/sec ( +- 59.13% )
> 1 cpu-migrations # 0.037 M/sec ( +-100.00% )
> 2181354 page-faults # 102191.251 M/sec ( +- 0.02% )
> 85477442102 cycles # 4004415.019 GHz ( +- 0.28% ) (83.33%)
> 14526427066 stalled-cycles-frontend # 16.99% frontend cycles idle ( +- 0.70% ) (83.33%)
> 32371533721 stalled-cycles-backend # 37.87% backend cycles idle ( +- 0.27% ) (33.34%)
> 67896890228 instructions # 0.79 insn per cycle
> # 0.48 stalled cycles per insn ( +- 0.03% ) (50.00%)
> 14592654840 branches # 683631198.653 M/sec ( +- 0.02% ) (66.67%)
> 212207534 branch-misses # 1.45% of all branches ( +- 0.94% ) (83.34%)
>
> 21.3502 +- 0.0585 seconds time elapsed ( +- 0.27% )
> ```
> New:
> ```
> $ perf stat -r 9 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT
>
> Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (9 runs):
>
> 7178.38 msec task-clock # 1.000 CPUs utilized ( +- 0.26% )
> 182 context-switches # 25.402 M/sec ( +- 28.84% )
> 0 cpu-migrations # 0.046 M/sec ( +- 70.71% )
> 33701 page-faults # 4694.994 M/sec ( +- 0.88% )
> 28761053971 cycles # 4006833.933 GHz ( +- 0.26% ) (83.32%)
> 2028297997 stalled-cycles-frontend # 7.05% frontend cycles idle ( +- 1.61% ) (83.32%)
> 10773154901 stalled-cycles-backend # 37.46% backend cycles idle ( +- 0.38% ) (33.36%)
> 36199132874 instructions # 1.26 insn per cycle
> # 0.30 stalled cycles per insn ( +- 0.03% ) (50.02%)
> 6434504227 branches # 896420204.421 M/sec ( +- 0.03% ) (66.68%)
> 73355176 branch-misses # 1.14% of all branches ( +- 1.46% ) (83.33%)
>
> 7.1807 +- 0.0190 seconds time elapsed ( +- 0.26% )
> ```
>
> So using `llvm::json` nearly triples run-time on that test case.
> (+3x is times, not percent.)
>
> Memory:
> Old:
> ```
> total runtime: 39.88s.
> bytes allocated in total (ignoring deallocations): 79.07GB (1.98GB/s)
> calls to allocation functions: 33267816 (834135/s)
> temporary memory allocations: 5832298 (146235/s)
> peak heap memory consumption: 9.21GB
> peak RSS (including heaptrack overhead): 147.98GB
> total memory leaked: 1.09MB
> ```
> New:
> ```
> total runtime: 17.42s.
> bytes allocated in total (ignoring deallocations): 5.12GB (293.86MB/s)
> calls to allocation functions: 21382982 (1227284/s)
> temporary memory allocations: 232858 (13364/s)
> peak heap memory consumption: 350.69MB
> peak RSS (including heaptrack overhead): 2.55GB
> total memory leaked: 79.95KB
> ```
> Diff:
> ```
> total runtime: -22.46s.
> bytes allocated in total (ignoring deallocations): -73.95GB (3.29GB/s)
> calls to allocation functions: -11884834 (529155/s)
> temporary memory allocations: -5599440 (249307/s)
> peak heap memory consumption: -8.86GB
> peak RSS (including heaptrack overhead): 0B
> total memory leaked: -1.01MB
> ```
> So using `llvm::json` increases *peak* memory consumption on *this* testcase ~+27x.
> And total allocation count +15x. Both of these numbers are times, *not* percent.
>
> And note that memory usage is clearly unbound with `llvm::json`, it directly depends
> on the length of the log, so peak memory consumption is always increasing.
> This isn't so with the dumb code, there is no accumulating memory consumption,
> peak memory consumption is fixed. Naturally, that means it will handle *much*
> larger logs without OOM'ing.
>
> Readability is good, but the price is simply unacceptable here.
> Too bad none of this analysis was done as part of the development/review D50129 itself.
>
> Reviewers: dberris, kpw, sammccall
>
> Reviewed By: dberris
>
> Subscribers: riccibruno, hans, courbet, jdoerfert, llvm-commits
>
> Tags: #llvm
>
> Differential Revision: https://reviews.llvm.org/D58584
>
> Modified:
> llvm/trunk/tools/llvm-xray/xray-converter.cpp
>
> Modified: llvm/trunk/tools/llvm-xray/xray-converter.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-xray/xray-converter.cpp?rev=354764&r1=354763&r2=354764&view=diff
> ==============================================================================
> --- llvm/trunk/tools/llvm-xray/xray-converter.cpp (original)
> +++ llvm/trunk/tools/llvm-xray/xray-converter.cpp Sun Feb 24 23:39:07 2019
> @@ -17,7 +17,6 @@
> #include "llvm/Support/EndianStream.h"
> #include "llvm/Support/FileSystem.h"
> #include "llvm/Support/FormatVariadic.h"
> -#include "llvm/Support/JSON.h"
> #include "llvm/Support/ScopedPrinter.h"
> #include "llvm/Support/YAMLTraits.h"
> #include "llvm/Support/raw_ostream.h"
> @@ -241,6 +240,31 @@ StackTrieNode *findOrCreateStackNode(
> return CurrentStack;
> }
>
> +void writeTraceViewerRecord(uint16_t Version, raw_ostream &OS, int32_t FuncId,
> + uint32_t TId, uint32_t PId, bool Symbolize,
> + const FuncIdConversionHelper &FuncIdHelper,
> + double EventTimestampUs,
> + const StackTrieNode &StackCursor,
> + StringRef FunctionPhenotype) {
> + OS << " ";
> + if (Version >= 3) {
> + OS << llvm::formatv(
> + R"({ "name" : "{0}", "ph" : "{1}", "tid" : "{2}", "pid" : "{3}", )"
> + R"("ts" : "{4:f4}", "sf" : "{5}" })",
> + (Symbolize ? FuncIdHelper.SymbolOrNumber(FuncId)
> + : llvm::to_string(FuncId)),
> + FunctionPhenotype, TId, PId, EventTimestampUs,
> + StackCursor.ExtraData.id);
> + } else {
> + OS << llvm::formatv(
> + R"({ "name" : "{0}", "ph" : "{1}", "tid" : "{2}", "pid" : "1", )"
> + R"("ts" : "{3:f3}", "sf" : "{4}" })",
> + (Symbolize ? FuncIdHelper.SymbolOrNumber(FuncId)
> + : llvm::to_string(FuncId)),
> + FunctionPhenotype, TId, EventTimestampUs, StackCursor.ExtraData.id);
> + }
> +}
> +
> } // namespace
>
> void TraceConverter::exportAsChromeTraceEventFormat(const Trace &Records,
> @@ -251,14 +275,18 @@ void TraceConverter::exportAsChromeTrace
>
> unsigned id_counter = 0;
>
> + OS << "{\n \"traceEvents\": [";
> DenseMap<uint32_t, StackTrieNode *> StackCursorByThreadId{};
> DenseMap<uint32_t, SmallVector<StackTrieNode *, 4>> StackRootsByThreadId{};
> DenseMap<unsigned, StackTrieNode *> StacksByStackId{};
> std::forward_list<StackTrieNode> NodeStore{};
> -
> - // Create a JSON Array which will hold all trace events.
> - json::Array TraceEvents;
> + int loop_count = 0;
> for (const auto &R : Records) {
> + if (loop_count++ == 0)
> + OS << "\n";
> + else
> + OS << ",\n";
> +
> // Chrome trace event format always wants data in micros.
> // CyclesPerMicro = CycleHertz / 10^6
> // TSC / CyclesPerMicro == TSC * 10^6 / CycleHertz == MicroTimestamp
> @@ -283,15 +311,8 @@ void TraceConverter::exportAsChromeTrace
> // type of B for begin or E for end, thread id, process id,
> // timestamp in microseconds, and a stack frame id. The ids are logged
> // in an id dictionary after the events.
> - TraceEvents.push_back(json::Object({
> - {"name", Symbolize ? FuncIdHelper.SymbolOrNumber(R.FuncId)
> - : llvm::to_string(R.FuncId)},
> - {"ph", "B"},
> - {"tid", llvm::to_string(R.TId)},
> - {"pid", llvm::to_string(Version >= 3 ? R.PId : 1)},
> - {"ts", llvm::formatv("{0:f4}", EventTimestampUs)},
> - {"sf", llvm::to_string(StackCursor->ExtraData.id)},
> - }));
> + writeTraceViewerRecord(Version, OS, R.FuncId, R.TId, R.PId, Symbolize,
> + FuncIdHelper, EventTimestampUs, *StackCursor, "B");
> break;
> case RecordTypes::EXIT:
> case RecordTypes::TAIL_EXIT:
> @@ -302,51 +323,43 @@ void TraceConverter::exportAsChromeTrace
> // (And/Or in loop termination below)
> StackTrieNode *PreviousCursor = nullptr;
> do {
> - TraceEvents.push_back(json::Object({
> - {"name", Symbolize
> - ? FuncIdHelper.SymbolOrNumber(StackCursor->FuncId)
> - : llvm::to_string(StackCursor->FuncId)},
> - {"ph", "E"},
> - {"tid", llvm::to_string(R.TId)},
> - {"pid", llvm::to_string(Version >= 3 ? R.PId : 1)},
> - {"ts", llvm::formatv("{0:f4}", EventTimestampUs)},
> - {"sf", llvm::to_string(StackCursor->ExtraData.id)},
> - }));
> + if (PreviousCursor != nullptr) {
> + OS << ",\n";
> + }
> + writeTraceViewerRecord(Version, OS, StackCursor->FuncId, R.TId, R.PId,
> + Symbolize, FuncIdHelper, EventTimestampUs,
> + *StackCursor, "E");
> PreviousCursor = StackCursor;
> StackCursor = StackCursor->Parent;
> } while (PreviousCursor->FuncId != R.FuncId && StackCursor != nullptr);
> break;
> }
> }
> + OS << "\n ],\n"; // Close the Trace Events array.
> + OS << " "
> + << "\"displayTimeUnit\": \"ns\",\n";
>
> // The stackFrames dictionary substantially reduces size of the output file by
> // avoiding repeating the entire call stack of function names for each entry.
> - json::Object StackFrames;
> - for (const auto &Stack : StacksByStackId) {
> - const auto &StackId = Stack.first;
> - const auto &StackFunctionNode = Stack.second;
> - json::Object::iterator It;
> - std::tie(It, std::ignore) = StackFrames.insert({
> - llvm::to_string(StackId),
> - json::Object{
> - {"name",
> - Symbolize ? FuncIdHelper.SymbolOrNumber(StackFunctionNode->FuncId)
> - : llvm::to_string(StackFunctionNode->FuncId)}},
> - });
> -
> - if (StackFunctionNode->Parent != nullptr)
> - It->second.getAsObject()->insert(
> - {"parent", llvm::to_string(StackFunctionNode->Parent->ExtraData.id)});
> + OS << R"( "stackFrames": {)";
> + int stack_frame_count = 0;
> + for (auto map_iter : StacksByStackId) {
> + if (stack_frame_count++ == 0)
> + OS << "\n";
> + else
> + OS << ",\n";
> + OS << " ";
> + OS << llvm::formatv(
> + R"("{0}" : { "name" : "{1}")", map_iter.first,
> + (Symbolize ? FuncIdHelper.SymbolOrNumber(map_iter.second->FuncId)
> + : llvm::to_string(map_iter.second->FuncId)));
> + if (map_iter.second->Parent != nullptr)
> + OS << llvm::formatv(R"(, "parent": "{0}")",
> + map_iter.second->Parent->ExtraData.id);
> + OS << " }";
> }
> -
> - json::Object TraceJSON{
> - {"displayTimeUnit", "ns"},
> - {"traceEvents", std::move(TraceEvents)},
> - {"stackFrames", std::move(StackFrames)},
> - };
> -
> - // Pretty-print the JSON using two spaces for indentations.
> - OS << formatv("{0:2}", json::Value(std::move(TraceJSON)));
> + OS << "\n }\n"; // Close the stack frames map.
> + OS << "}\n"; // Close the JSON entry.
> }
>
> namespace llvm {
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list