[Lldb-commits] [PATCH] D103588: [trace] Create a top-level instruction class

Tue Jun 8 17:07:20 PDT 2021

vsk added a comment.

In D103588#2806611 <https://reviews.llvm.org/D103588#2806611>, @wallace wrote:

>> This approach - of prototyping trace analyses on a vector<TraceInstruction> representation and then rewriting them later when scaling issues arise - doesn't sound good to me. Even for something simple, like finding a backtrace, the window of instructions needed for analysis can easily grow to a point where the full vector<TraceInstruction> can't be maintained in memory.
>
> It's worth noticing that the vector<TraceInstruction> is internal to the Intel PT plugin. It's not exposed at the Trace.h interface level, so any consumers can't make any assumptions on how the data is stored. We will improve the intel pt plugin itself as we make progress on this, but we are ensuring that the interfaces are correct for the future.

This doesn't quite alleviate my concern. We'd like for the generic trace infrastructure (outside of the PT plugin) to support different tracing technologies as well. It's not clear that the TraceInstruction concept can do that.

Two questions that have remained unanswered here: (1) how would TraceInstruction be used by non-PT plugins? (2) how do we guarantee the generic analyses you're planning to write will scale?

>> To clarify: I think that's a perfectly reasonable way to prototype trace analyses, but I don't think that kind of prototyping should be done at the SB API level, as there are extra bincompat & portability concerns here.
>
> I know, and I want to stress out that we are not exposing any of this instruction information at the SB API level and probably we won't for a while, i.e. I don't plan on creating a SBTraceInstruction class, as there's too much boilerplate needed to expose an SBInstruction object. For now, we want to have some code that processes instructions and that exists at the Trace.h level, outside of the intel-pt plugin, as we want to make it generic enough to be utilized by different tracing technologies.

lldb's SB interfaces have always been relatively thin wrappers around classes in include/lldb (SBTarget -> Target, SBProcess -> Process, etc.). So it's really worth spending the extra time to get the generic interfaces right, even if they're not SB API yet, since that's ultimately the direction in which we'll want to go.

>> It's not at all clear to me that lldb needs to surface a generic TraceInstruction object, with metadata like m_byte_size, m_inst_type, etc. stored inline. It seems more flexible to instead vend a cursor (just an integer) that indicates a position in the trace; a separate set of APIs could answer queries given a cursor, e.g. 'getInstTypeAtCursor(u64 cursor)'.
>
> There's an advantage. Decoding intel-pt instructions is very slow and we are able to produce correct indices only if we decode the raw trace sequentially, either backwards (for negative indices) or forwards (for positive indices). If we start from the middle, we don't know which instruction we are at. If we allow the user to query properties of arbitrary indices, we might need to decode a big prefix or a big suffix of the trace until we get to that index. It seems safer to me to expose right away the properties of these instructions as soon as they are decoded, so that we don't need any further decoding if the user invokes a query method with a possibly wrong index.

I don't see how the fact that PT decoding is slow necessitates the TraceInstruction representation. I don't find that the issues you're ascribing to a cursor representation are inevitable, or all that different from the issues you'd run into with generating a large number of TraceInstructions for a subset of a trace (also not clear how this subset would be specified).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103588/new/

https://reviews.llvm.org/D103588