[PATCH] D21987: [XRay] Initial XRay Function Call Accounting Tool

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 11 15:43:17 PDT 2016


dblaikie added a comment.

> I've added something which should explain a little what this tool already does at the moment. For the moment I'll hold off on adding more tests (I haven't figured out where the tests for the tools or parts of the tools go yet).


Tests generally go in with the functionality in the same commit - and tests for the <foo> tool live in the test/tools/<foo> directory. If some of this functionality ends up sufficiently factored out into utility libraries, unit tests may be written - though unit tests are pretty rare in LLVM (just culturally - I'm actually more of a fan of them) outside of the general data structures in ADT and some of Support. Mostly it's just end-to-end testing of specific tools, optimization passes, etc.


================
Comment at: tools/xray/xray-fc-account.cc:168-171
@@ +167,6 @@
+  // Find the section named "xray_instr_map".
+  StringRef Contents = "";
+  const auto &Sections = ObjectFile->getBinary()->sections();
+  auto I = llvm::find_if(Sections, [&](const object::SectionRef &Section) {
+    StringRef Name = "";
+    if (Section.getName(Name))
----------------
Helps a bit - but generally LLVM tools are platform agnostic. It sounds like this currently is not - that the tool would only work on the platform that generated the file (& that when we port xray to other platforms, again, the tool would only work on the desired platform)

Seems pretty likely someone might take a report from some buildbot that they don't have local hardware for, then want to analyze that on their x86 developer machine. So I think it's probably a good idea to ensure the tools are not platform specific like this. But perhaps I'm overengineering this?

================
Comment at: tools/xray/xray-fc-account.cc:237-267
@@ +236,33 @@
+
+  std::string FileLineAndColumn(int32_t FuncId) const {
+    auto It = FunctionAddresses.find(FuncId);
+    if (It == FunctionAddresses.end()) {
+      return "(unknown)";
+    }
+    std::ostringstream F;
+    auto ResOrErr = Symbolizer.symbolizeCode(BinaryInstrMap, It->second);
+    if (!ResOrErr || ResOrErr.get().FileName == "<invalid>")
+      return "(unknown)";
+    auto &DI = ResOrErr.get();
+    auto LastSlash = DI.FileName.find_last_of('/');
+    if (LastSlash != std::string::npos) {
+      F << DI.FileName.substr(LastSlash + 1);
+    } else {
+      F << DI.FileName;
+    }
+    F << ":" << DI.Line << ":" << DI.Column;
+    return F.str();
+  }
+};
+
+// We consolidate the data into a struct which we can output in various forms.
+struct ResultRow {
+  uint64_t Count;
+  double Min;
+  double Median;
+  double Pct90;
+  double Pct99;
+  double Max;
+  std::string DebugInfo;
+  std::string Function;
+};
----------------
I'm not sure whether a generic CSV-style output library would be that important/useful, but maybe I'm not quite picturing what you have in mind.

As for an output system - I'd probably skip the array of variants and use a variadic function template to take all the fields.

I was thinking something more high level/closer to the format you have in mind here, but perhpas I'm having trouble picturing exactly what the output formats are and how generic an API to them would look.

Perhaps there is a reasonable high level API for this that could abstract away the differences between CSV and pretty printed output that wouldn't be specific to the contents.

Test cases might make it easier to see what the final output formats look like & thus what some common API to produce that information might look like.



================
Comment at: tools/xray/xray-fc-account.cc:453
@@ +452,3 @@
+      auto &Timings = FT.second;
+      std::nth_element(Timings.begin(), Timings.begin(), Timings.end());
+      auto Min = Timings.front();
----------------
std::min_element?

================
Comment at: tools/xray/xray-fc-account.cc:453-471
@@ +452,21 @@
+      auto &Timings = FT.second;
+      std::nth_element(Timings.begin(), Timings.begin(), Timings.end());
+      auto Min = Timings.front();
+      auto MedianOff = Timings.size() / 2;
+      std::nth_element(Timings.begin(), Timings.begin() + MedianOff,
+                       Timings.end());
+      auto Median = Timings[MedianOff];
+      auto Pct90Off = std::floor(Timings.size() * 0.9);
+      std::nth_element(Timings.begin(), Timings.begin() + Pct90Off,
+                       Timings.end());
+      auto Pct90 = Timings[Pct90Off];
+      auto Pct99Off = std::floor(Timings.size() * 0.99);
+      std::nth_element(Timings.begin(), Timings.begin() + Pct99Off,
+                       Timings.end());
+      auto Pct99 = Timings[Pct99Off];
+      std::nth_element(Timings.begin(), Timings.begin() + Timings.size(),
+                       Timings.end());
+      auto Max = Timings.back();
+      ResultRow Row;
+      Row.Count = Timings.size();
+      Row.Min = Min;
----------------
dblaikie wrote:
> std::min_element?
Reckon it'd be easier to just sort this then pick out the elements you want? Or are all these partial orderings better?

================
Comment at: tools/xray/xray-fc-account.cc:467
@@ +466,3 @@
+      auto Pct99 = Timings[Pct99Off];
+      std::nth_element(Timings.begin(), Timings.begin() + Timings.size(),
+                       Timings.end());
----------------
Timings.begin() + Timings.size() == Timings.end() ?

And then it's probably just std::max_element rather than std::nth_element?

================
Comment at: tools/xray/xray-fc-account.cc:470-476
@@ +469,9 @@
+      auto Max = Timings.back();
+      ResultRow Row;
+      Row.Count = Timings.size();
+      Row.Min = Min;
+      Row.Median = Median;
+      Row.Pct90 = Pct90;
+      Row.Pct99 = Pct99;
+      Row.Max = Max;
+      if (HasCyclesPerSec) {
----------------
Maybe just move the declaration of Row up and initialize all these members directly so you don't need the extra/duplicate variables?

================
Comment at: tools/xray/xray-fc-account.cc:509-516
@@ +508,10 @@
+               << ")\t";
+      if (OutputFormat == OutputFormats::CSV)
+        outs() << Thread.first << ',' << MinMax.first << ',' << MinMax.second
+               << ',';
+      if (HasCyclesPerSec) {
+        outs() << (MinMax.second - MinMax.first) / CyclesPerSec;
+      } else {
+        outs() << (MinMax.second - MinMax.first);
+      }
+      outs() << '\n';
----------------
Inconsistent use of {} on single line statements (I don't think there's a convention to put {} on if/else but not on if - some people put {} on all parts of an if/else chain if any one part needs them, but that's not the case here)

================
Comment at: tools/xray/xray-fc-account.cc:521
@@ +520,3 @@
+
+  // Print out CPU start/end ranges and durations.
+  if (!CPUMinMaxTSC.empty()) {
----------------
I'd consider breaking each of these blocks (there are several top level blocks in main, each with a comment at the start) of code out into a separate function for readability? Comment -> function name, ideally


https://reviews.llvm.org/D21987





More information about the llvm-commits mailing list