[PATCH] D21987: [XRay] Initial XRay Function Call Accounting Tool

Dean Michael Berris via llvm-commits llvm-commits at lists.llvm.org
Sun Aug 7 23:07:38 PDT 2016


dberris added a comment.

> I'm certainly generally in favor of separation and unit tests, though I perhaps don't understand the high level features of the tool as-is, or what it might grow, to have a good sense of where the points of separation and testing might occur. Perhaps a more detailed explanation and test cases would help explain this further.


Right, I think this file needs a few more comments at the top to describe what this is meant to do. :)

I've added something which should explain a little what this tool already does at the moment. For the moment I'll hold off on adding more tests (I haven't figured out where the tests for the tools or parts of the tools go yet).


================
Comment at: tools/xray/xray-fc-account.cc:167-170
@@ +166,6 @@
+  // the runtime, but that's a design point we can discuss later on. For now, we
+  // replicate the logic and move on.
+  int32_t FuncId = 1;
+  uint64_t CurFn = 0;
+  for (const auto &Sled : Sleds) {
+    auto F = Sled.Function;
----------------
> If these are just standard layout structs (which they seem to be), you could allocate an array of them with 'new XRaySledEntry[N]' as usual. Though I think you could get aligned memory by new'ing a std::aligned_storage object... maybe?

Unfortunately, neither of those are guaranteed to be honoured by the standard allocator (or even the default implementation of 'new') in terms of returning an aligned memory location. :(

What's happening here really is that, since the data is a global, defined by the compiler (when we emit the XRay sleds) we have an opportunity to align the data then. I suspect that has to be fixed first for this to work predictably (i.e. make each entry in the sled be always aligned to 32-byte boundaries). Let me send a change to make that happen, so that this would "just work".

> But this'll still only work to run the tool on the same architecture, layout, etc, as the original - is that an intentional limitation? Or should we be making a more fixed/deliberate choice about the file format and having writers format it appropriately, and readers deserialize it appropriately?

This is currently relying on ELF sections for the data. I suspect that's fine for now, but later on a more flexible representation of the sleds would be required (for example, if the instrumentation map was encoded as json or some other format that's easier to consume with other tools).

================
Comment at: tools/xray/xray-fc-account.cc:167-170
@@ +166,6 @@
+  // the runtime, but that's a design point we can discuss later on. For now, we
+  // replicate the logic and move on.
+  int32_t FuncId = 1;
+  uint64_t CurFn = 0;
+  for (const auto &Sled : Sleds) {
+    auto F = Sled.Function;
----------------
dberris wrote:
> > If these are just standard layout structs (which they seem to be), you could allocate an array of them with 'new XRaySledEntry[N]' as usual. Though I think you could get aligned memory by new'ing a std::aligned_storage object... maybe?
> 
> Unfortunately, neither of those are guaranteed to be honoured by the standard allocator (or even the default implementation of 'new') in terms of returning an aligned memory location. :(
> 
> What's happening here really is that, since the data is a global, defined by the compiler (when we emit the XRay sleds) we have an opportunity to align the data then. I suspect that has to be fixed first for this to work predictably (i.e. make each entry in the sled be always aligned to 32-byte boundaries). Let me send a change to make that happen, so that this would "just work".
> 
> > But this'll still only work to run the tool on the same architecture, layout, etc, as the original - is that an intentional limitation? Or should we be making a more fixed/deliberate choice about the file format and having writers format it appropriately, and readers deserialize it appropriately?
> 
> This is currently relying on ELF sections for the data. I suspect that's fine for now, but later on a more flexible representation of the sleds would be required (for example, if the instrumentation map was encoded as json or some other format that's easier to consume with other tools).
new'ing a std::aligned_storage object doesn't seem to work. :( C++17 fixes this by being able to provide an alignment target to new, but we can't use that yet (also, alignment-aware allocators aren't a thing yet AFAICT either).

What will work is asking for 2 x sizeof(T) then manually aligning the address and then inplace new'ing. That might be a useful little utility, but I don't think we need to do that here (yet).

Note that this is loaded from the ELF file section using the reader. I don't know yet whether the alignment enforced by the compiler/linker will translate to aligned memory pointers -- though if they're being mmapped then they _should_ be.

For now we're assuming that we're reading the ELF section and that there's an analog to this in other formats (COFF, MachO, etc.) that LLVM supports.

Does this help?

================
Comment at: tools/xray/xray-fc-account.cc:236-266
@@ +235,33 @@
+
+// We consolidate the data into a struct which we can output in various forms.
+struct ResultRow {
+  uint64_t Count;
+  double Min;
+  double Median;
+  double Pct90;
+  double Pct99;
+  double Max;
+  std::string DebugInfo;
+  std::string Function;
+};
+
+void PrintStatsCSV(const ResultRow &Row) {
+  outs() << Row.Count << ',' << Row.Min << ',' << Row.Median << ',' << Row.Pct90
+         << ',' << Row.Pct99 << ',' << Row.Max << ",\"" << Row.DebugInfo
+         << "\"," << '"' << Row.Function << "\"\n";
+}
+
+void PrintStatsTEXT(const ResultRow &Row) {
+  outs() << Row.Count << "\t"
+         << "[" << Row.Min << ", " << Row.Median << ", " << Row.Pct90 << ", "
+         << Row.Pct99 << ", " << Row.Max << "] " << Row.DebugInfo << "\t"
+         << Row.Function << "\n";
+}
+
+void PrintStatsHeader(OutputFormats OF) {
+  if (OF == OutputFormats::TEXT)
+    outs() << "count\t[min,median,90%ile,99%ile,max]\tdebug\tfunction\n";
+  if (OF == OutputFormats::CSV)
+    outs() << "count,min,median,90%ile,99%ile,max,debuginfo,function\n";
+}
+
----------------
dblaikie wrote:
> I was thinking of something more aggressive - factoring out /all/ of the "if (TEXT)/if (CSV)" code, or at least most of it, from the main part of the program, not just these two places. Though I'm open to hearing that that's not practical/useful, for sure - I'm just looking at the high level & seeing this pivot and wondering if it could be streamlined/separated a bit.
I think it's better if there was a library for CSV and structured table output if that's going to happen a lot. Although I think it's worth crossing that bridge later. If you have suggestions on some useful data structures like the analog of a std::variant (to support `std::array<std::variant<int, string, float, double>, N>` for outputting rows) in LLVM, I'd love to write up a simple table output formatting utility.

I can foresee at least one more analysis/conversion where we turn the data available from the traces into something that's good to deal with in some spreadsheet application, so making this easier and more flexible in LLVM at least will be very useful.

Any suggestions on:

1) Where should something like that live? lib/Support/ ? or lib/ADT/ ?
2) What kind of interface would be preferred for something like that?

================
Comment at: tools/xray/xray-fc-account.cc:265
@@ +264,3 @@
+  if (OF == OutputFormats::CSV)
+    outs() << "count,min,median,90%ile,99%ile,max,debuginfo,function\n";
+}
----------------
Yeah, this turns out to be much simpler if the body of the loop was empty. Thanks. :)


https://reviews.llvm.org/D21987





More information about the llvm-commits mailing list