[Lldb-commits] [PATCH] D126267: [trace][intelpt] Support system-wide tracing [13] - Add context switch decoding

Jakob Johnson via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Fri May 27 12:36:59 PDT 2022


jj10306 requested changes to this revision.
jj10306 added inline comments.
This revision now requires changes to proceed.


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp:22-42
+struct PerfContextSwitchRecord {
+  struct perf_event_header header;
+  uint32_t next_prev_pid;
+  uint32_t next_prev_tid;
+  uint32_t pid, tid;
+  uint64_t time_in_nanos;
+
----------------
should these structures live in `Perf.h`?


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp:59-102
+ThreadContinuousExecution ThreadContinuousExecution::CreateCompleteExecution(
+    lldb::core_id_t core_id, lldb::tid_t tid, uint64_t start, uint64_t end) {
+  ThreadContinuousExecution o(core_id, tid);
+  o.variant = Variant::Complete;
+  o.tscs.complete.start = start;
+  o.tscs.complete.end = end;
+  return o;
----------------
nit: if you switch the union for two optional values you can remove a lot of the redundancy between these methods.


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp:136-141
+  if (record.IsIn() && prev.IsIn()) {
+    // We found two consecutive ins, which means that we didn't capture
+    // the end of the previous execution.
+    on_new_thread_execution(ThreadContinuousExecution::CreateHintedEndExecution(
+        core_id, prev.tid, prev.tsc, record.tsc - 1));
+  } else if (record.IsOut() && prev.IsOut()) {
----------------
can you help me understand how these two cases could happen? they seem fundamentally impossible given the nature of context switching - shouldn't all "continuous" execution be disjoint?

My current understanding is as follows:

Expected:
i1 o1 i4 o4 i9 o9

Impossible:
and this is not possible:
i1 i4 o4 o1  9 o9

Let me know if I'm missing something 🙂



================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp:146
+        ThreadContinuousExecution::CreateHintedStartExecution(
+            core_id, record.tid, prev.tsc + 1, record.tsc));
+  } else if (record.IsOut() && prev.IsIn()) {
----------------
why the + 1?


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp:181
+///   the trace.
+static Error DecodePerfContextSwitchTrace(
+    ArrayRef<uint8_t> data, core_id_t core_id,
----------------
Do you think any of the general perf logic related to "decoding" the records should be moved to `Perf.h/cpp`?


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp:196
+    const PerfContextSwitchRecord &perf_record =
+        *reinterpret_cast<const PerfContextSwitchRecord *>(data.data() +
+                                                           offset);
----------------
nit: it feels weird casting to `PerfContextSwitchRecord` when we don't yet know if this is actually a context switch event without first looking at the header.
casting to `perf_event_header` and checking that first before interpreting the record as a context switch record seems like a better approach.
Given that currently only intelpt is using the LLDB's perf "library" this isn't a big deal, but if we wanted to make it more complete/robust, we should revisit this and improve our record handling design so it could be easily extended to support any record types.


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp:198
+                                                           offset);
+    // A record of 1000 uint64_t's or more should mean that the data is wrong
+    if (perf_record.header.size == 0 ||
----------------
can you link the documentation that states this?


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp:199-200
+    // A record of 1000 uint64_t's or more should mean that the data is wrong
+    if (perf_record.header.size == 0 ||
+        perf_record.header.size > sizeof(uint64_t) * 1000)
+      return CreateError(offset, formatv("A record of {0} bytes was found.",
----------------
using sizeof on uint64_t feels weird since the typename already implies the name. I think moving this value to a constant and explaining its value would make things cleaner.


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp:204-205
+
+    // We add + 100 to this record because some systems might have custom
+    // records. In any case, we are looking only for abnormal data.
+    if (perf_record.header.type >= PERF_RECORD_MAX + 100)
----------------
same as above, can we link the docs. alternatively, link the docs at the top of the function or in the header and then reference that link at the appropriate spots in the code


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp:263-272
+  auto do_decode = [&]() -> Error {
+    // We'll decode all context switch traces, identify continuous executions
+    // and group them by thread.
+    for (core_id_t core_id : m_cores) {
+      Error err = m_trace.OnCoreBinaryDataRead(
+          core_id, IntelPTDataKinds::kPerfContextSwitchTrace,
+          [&](ArrayRef<uint8_t> data) -> Error {
----------------
this is lambda inception 😆


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.h:31-33
+    HintedStart,
+    /// The start is known and we have a guess for the end
+    HintedEnd,
----------------
Where do these "guesses" come from?


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.h:40-61
+  union {
+    struct {
+      uint64_t start;
+      uint64_t end;
+    } complete;
+    struct {
+      uint64_t start;
----------------
what about just having two optionals fields to remove the "dangers" that unions introduce? Then the Variant enum can be used to guide whether the start and end values should be non-null just as it's being used to access the union currently


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.h:105-107
+/// each Intel PT decoded instruction and in which order. It also assumes that
+/// the Intel PT data and context switches might have gaps in their traces due
+/// to contention or race conditions.
----------------
what is meant by "contention" here? Is this referring to the ipt aux buffer wrapping?


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.h:111
+  /// \param[in] core_ids
+  ///   The list of cores where the traced programs were running on.
+  ///
----------------
is this just the cores that the program ran on or all possible cores the process could have run on?


================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.h:155
+  /// of the decoder.
+  llvm::Optional<std::string> m_setup_error;
+};
----------------
why is this needed?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126267/new/

https://reviews.llvm.org/D126267



More information about the lldb-commits mailing list