[Lldb-commits] [PATCH] D122603: [wip][intelpt] Refactor timestamps out of `IntelPTInstruction`

Mon Mar 28 12:16:14 PDT 2022

wallace requested changes to this revision.
wallace added inline comments.
This revision now requires changes to proceed.

================
Comment at: lldb/source/Plugins/Trace/intel-pt/DecodedThread.cpp:94-96
+  auto last_ts = m_instruction_timestamps[m_instruction_timestamps.size() - 1];
+  if (last_ts != time)
+    m_instruction_timestamps.emplace(m_instructions.size() - 1, time);
----------------
doing [] is O(logn), and we want to be faster than this.

You can do the following which is O(1)

  auto it = m_instruction_timestamps.end()
  if (it != m_instruction_timestamps.begin()) {
    it--;
    if (it->second != tsc) {
      // this tsc is not the same!
     m_instruction_timestamps.insert(insn_idx, tsc);
    } else {
      // this tsc is the same, do nothing
    }
  }

you can further optimize this by storing the last tsc that has been appended, that way you don't even need to create iterators

================
Comment at: lldb/source/Plugins/Trace/intel-pt/DecodedThread.cpp:109-112
+  auto insn_ts = m_instruction_timestamps.find(index);
+  if (!m_instruction_timestamps.empty() && insn_ts != m_instruction_timestamps.end())
+    return insn_ts->second;
+  return 0;
----------------
this is wrong, you need to use upper_bound - 1, like this:

  if (m_instruction_timestamps.empty())
    return None;

  auto it = m_instruction_timestamps.upper_bound(insn_idx);
  if (it == m_instruction_timestamps.begin())
    return None;

  --it;
  return it->second;

this will allow you to go to the largest index that is <= than insn_idx

================
Comment at: lldb/source/Plugins/Trace/intel-pt/DecodedThread.h:141-142

+  /// Get timestamp of an instruction by its index.
+  uint64_t GetInstructionTimestamp(size_t index) const;
+
----------------
wallace wrote:
> This has to be an optional because it might not be present
better use insn_idx instead of index for all these variables

================
Comment at: lldb/source/Plugins/Trace/intel-pt/DecodedThread.h:144-148
+  /// Check if the instruction at a given index was an error. 
+  /// -- Reasoning (this will be removed before committing)
+  /// This is faster and less wasteful of memory than creating an ArrayRef 
+  /// every time that you need to check this, with GetInstructions()[i].IsError()
+  bool GetInstructionIsError(size_t insn_idx) const;
----------------
I actually like this, let's improve it

================
Comment at: lldb/source/Plugins/Trace/intel-pt/IntelPTDecoder.cpp:153-154
       } else {
-        decoded_thread_sp->AppendInstruction(insn, time);
+        decoded_thread_sp->AppendInstruction(insn);
+        decoded_thread_sp->AppendInstructionTimestamp(time);
       }
----------------

================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceCursorIntelPT.cpp:25
          "a trace should have at least one instruction or error");
   m_pos = m_decoded_thread_sp->GetInstructions().size() - 1;
 }
----------------
here you need to set the correct value of m_current_tsc

================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceCursorIntelPT.cpp:43-44
     m_pos += IsForwards() ? 1 : -1;
+    if (uint64_t ts = m_decoded_thread_sp->GetInstructionTimestamp(m_pos) != 0)
+      m_currentts = ts;
     if (!m_ignore_errors && IsError())
----------------
this will be O(logn). We can do better if m_current_tsc is the following little structure

class DecodedThread {
  struct TscRange {
    size_t start_index;
    size_t end_index;
    size_t tsc;
    std::map<size_t, uint64_t>::iterator prev;
    std::map<size_t, uint64_t>::iterator next;
  };
}
  Optional<TscRange> m_current_tsc;

Then you can ask the new method `Optional<TscRange> DecodedThread::GetTSCRange(size_t insn_index)` which will give you the entire range of the tsc that covers insn_index. With these numbers, you can do a comparison in this line to very quickly move from TSC to TSC only when needed.

You can also have the method `DecodedThread::GetNextTscRange(const TscRange& range)` that computes in O(1) the next range, and you can similarly have GetPrevTscRange()`. The iterators will help you do that withing using lower_bound, which is O(1)

================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceCursorIntelPT.cpp:75
     return std::abs(dist);
   }
 }
----------------
you need to calculate the new tsc_range after moving m_pos

================
Comment at: lldb/source/Plugins/Trace/intel-pt/TraceCursorIntelPT.h:45-46
   size_t m_pos;
+  /// Current instruction timestamp.
+  uint64_t m_currentts;
 };
----------------
Optional<uint64_t> m_current_tsc;

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122603/new/

https://reviews.llvm.org/D122603