[Lldb-commits] [PATCH] D103588: [trace] Create a top-level instruction class

Tue Jun 8 15:10:00 PDT 2021

wallace added a comment.

> This approach - of prototyping trace analyses on a vector<TraceInstruction> representation and then rewriting them later when scaling issues arise - doesn't sound good to me. Even for something simple, like finding a backtrace, the window of instructions needed for analysis can easily grow to a point where the full vector<TraceInstruction> can't be maintained in memory.

It's worth noticing that the vector<TraceInstruction> is internal to the Intel PT plugin. It's not exposed at the Trace.h interface level, so any consumers can't make any assumptions on how the data is stored. We will improve the intel pt plugin itself as we make progress on this, but we are ensuring that the interfaces are correct for the future.

> To clarify: I think that's a perfectly reasonable way to prototype trace analyses, but I don't think that kind of prototyping should be done at the SB API level, as there are extra bincompat & portability concerns here.

I know, and I want to stress out that we are not exposing any of this instruction information at the SB API level and probably we won't for a while, i.e. I don't plan on creating a SBTraceInstruction class, as there's too much boilerplate needed to expose an SBInstruction object. For now, we want to have some code that processes instructions and that exists at the Trace.h level, outside of the intel-pt plugin, as we want to make it generic enough to be utilized by different tracing technologies.

> It's not at all clear to me that lldb needs to surface a generic TraceInstruction object, with metadata like m_byte_size, m_inst_type, etc. stored inline. It seems more flexible to instead vend a cursor (just an integer) that indicates a position in the trace; a separate set of APIs could answer queries given a cursor, e.g. 'getInstTypeAtCursor(u64 cursor)'.

There's an advantage. Decoding intel-pt instructions is very slow and we are able to produce correct indices only if we decode the raw trace sequentially, either backwards (for negative indices) or forwards (for positive indices). If we start from the middle, we don't know which instruction we are at. If we allow the user to query properties of arbitrary indices, we might need to decode a big prefix or a big suffix of the trace until we get to that index. It seems safer to me to expose right away the properties of these instructions as soon as they are decoded, so that we don't need any further decoding if the user invokes a query method with a possibly wrong index.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103588/new/

https://reviews.llvm.org/D103588