[lldb-dev] [Reproducers] SBReproducer RFC

Mon Jan 7 13:13:40 PST 2019

On Mon, Jan 7, 2019 at 3:52 AM Tamas Berghammer <tberghammer at google.com>
wrote:

> Thanks Pavel for looping me in. I haven't looked into the actual
> implementation of the prototype yet but reading your description I have
> some concern regarding the amount of data you capture as I feel it isn't
> sufficient to reproduce a set of usecases.
>

Thanks Tamas!

> One problem is when the behavior of LLDB is not deterministic for whatever
> reason (e.g. multi threading, unordered maps, etc...). Lets take
> SBModule::FindSymbols() what returns an SBSymbolContextList without any
> specific order (haven't checked the implementation but I would consider a
> random order to be valid). If a user calls this function, then iterates
> through the elements to find an index `I`, calls `GetContextAtIndex(I)` and
> pass the result into a subsequent function then what will we do. Will we
> capture what did `GetContextAtIndex(I)` returned in the trace and use that
> value or will we capture the value of `I`, call `GetContextAtIndex(I)`
> during reproduction and use that value. Doing the first would be correct in
> this case but would mean we don't call `GetContextAtIndex(I)` while doing
> the second case would mean we call `GetContextAtIndex(I)` with a wrong
> index if the order in SBSymbolContextList is non deterministic. In this
> case as we know that GetContextAtIndex is just an accessor into a vector
> the first option is the correct one but I can imagine cases where this is
> not the case (e.g. if GetContextAtIndex would have some useful side effect).
>

Indeed, in this scenario we would replay the call with the same `I`
resulting in an incorrect value. I think the only solution is fixing the
non-derterminism. This should be straightforward for lists (some kind of
sensible ordering), but maybe there are other issues I'm not aware of.

> Other interesting question is what to do with functions taking raw binary
> data in the form of a pointer + size (e.g. SBData::SetData). I think we
> will have to annotate these APIs to make the reproducer system aware of the
> amount of data they have to capture and then allocate these buffers with
> the correct lifetime during replay. I am not sure what would be the best
> way to attach these annotations but I think we might need a fairly generic
> framework because I won't be surprised if there are more situation when we
> have to add annotations to the API. I slightly related question is if a
> function returns a pointer to a raw buffer (e.g. const char* or void*) then
> do we have to capture the content of it or the pointer for it and in either
> case what is the lifetime of the buffer returned (e.g.
> SBError::GetCString() returns a buffer what goes out of scope when the
> SBError goes out of scope).
>

This a good concern and not something I had a good solution for at this
point. For const char* string we work around this by serializing the actual
string. Obviously that won't always work. Also we have the void* batons for
callsback, which is another tricky thing that wouldn't be supported. I'm
wondering if we can get away with ignoring these at first (maybe printing
something in the replay logic that warns the user that the reproducer
contains an unsupported function?).

> Additionally I am pretty sure we have at least some functions returning
> various indices what require remapping other then the pointers either
> because they are just indexing into a data structure with undefined
> internal order or they referencing some other resource. Just by randomly
> browsing some of the SB APIs I found for example SBHostOS::ThreadCreate
> what returns the pid/tid for the newly created thread what will have to be
> remapped (it also takes a function as an argument what is a problem as
> well). Because of this I am not sure if we can get away with an
> automatically generated set of API descriptions instead of wring one with
> explicit annotations for the various remapping rules.
>

Fixing the non-determinism should also address this, right?

> If there is interest I can try to take a deeper look into the topic
> sometime later but I hope that those initial thoughts are useful.
>

Thank you. I'll start by incorporating the feedback and ping the thread
when the patch is ready for another look.

> Tamas
>
> On Mon, Jan 7, 2019 at 9:40 AM Pavel Labath <pavel at labath.sk> wrote:
>
>> On 04/01/2019 22:19, Jonas Devlieghere via lldb-dev wrote:
>> > Hi Everyone,
>> >
>> > In September I sent out an RFC [1] about adding reproducers to LLDB.
>> > Over the
>> > past few months, I landed the reproducer framework, support for the GDB
>> > remote
>> > protocol and a bunch of preparatory changes. There's still an open code
>> > review
>> > [2] for dealing with files, but that one is currently blocked by a
>> change to
>> > the VFS in LLVM [3].
>> >
>> > The next big piece of work is supporting user commands (e.g. in the
>> > driver) and
>> > SB API calls. Originally I expected these two things to be separate,
>> but
>> > Pavel
>> > made a good case [4] that they're actually very similar.
>> >
>> > I created a prototype of how I envision this to work. As usual, we can
>> > differentiate between capture and replay.
>> >
>> > ## SB API Capture
>> >
>> > When capturing a reproducer, every SB function/method is instrumented
>> > using a
>> > macro at function entry. The added code tracks the function identifier
>> > (currently we use its name with __PRETTY_FUNCTION__) and its arguments.
>> >
>> > It also tracks when a function crosses the boundary between internal and
>> > external use. For example, when someone (be it the driver, the python
>> > binding
>> > or the RPC server) call SBFoo, and in its implementation SBFoo calls
>> > SBBar, we
>> > don't need to record SBBar. When invoking SBFoo during replay, it will
>> > itself
>> > call SBBar.
>> >
>> > When a boundary is crossed, the function name and arguments are
>> > serialized to a
>> > file. This is trivial for basic types. For objects, we maintain a table
>> that
>> > maps pointer values to indices and serialize the index.
>> >
>> > To keep our table consistent, we also need to track return for
>> functions
>> > that
>> > return an object by value. We have a separate macro that wraps the
>> returned
>> > object.
>> >
>> > The index is sufficient because every object that is passed to a
>> > function has
>> > crossed the boundary and hence was recorded. During replay (see below)
>> > we map
>> > the index to an address again which ensures consistency.
>> >
>> > ## SB API Replay
>> >
>> > To replay the SB function calls we need a way to invoke the
>> corresponding
>> > function from its serialized identifier. For every SB function, there's
>> a
>> > counterpart that deserializes its arguments and invokes the function.
>> These
>> > functions are added to the map and are called by the replay logic.
>> >
>> > Replaying is just a matter looping over the function identifiers in the
>> > serialized file, dispatching the right deserialization function, until
>> > no more
>> > data is available.
>> >
>> > The deserialization function for constructors or functions that return
>> > by value
>> > contains additional logic for dealing with the aforementioned indices.
>> The
>> > resulting objects are added to a table (similar to the one described
>> > earlier)
>> > that maps indices to pointers. Whenever an object is passed as an
>> > argument, the
>> > index is used to get the actual object from the table.
>> >
>> > ## Tool
>> >
>> > Even when using macros, adding the necessary capturing and replay code
>> is
>> > tedious and scales poorly. For the prototype, we did this by hand, but
>> we
>> > propose a new clang-based tool to streamline the process.
>> >
>> > For the capture code, the tool would validate that the macro matches the
>> > function signature, suggesting a fixit if the macros are incorrect or
>> > missing.
>> > Compared to generating the macros altogether, it has the advantage that
>> we
>> > don't have "configured" files that are harder to debug (without faking
>> line
>> > numbers etc).
>> >
>> > The deserialization code would be fully generated. As shown in the
>> prototype
>> > there are a few different cases, depending on whether we have to
>> account for
>> > objects or not.
>> >
>> > ## Prototype Code
>> >
>> > I created a differential [5] on Phabricator with the prototype. It
>> > contains the
>> > necessary methods to re-run the gdb remote (reproducer) lit test.
>> >
>> > ## Feedback
>> >
>> > Before moving forward I'd like to get the community's input. What do
>> you
>> > think
>> > about this approach? Do you have concerns or can we be smarter
>> > somewhere? Any
>> > feedback would be greatly appreciated!
>> >
>> > Thanks,
>> > Jonas
>> >
>> > [1] http://lists.llvm.org/pipermail/lldb-dev/2018-September/014184.html
>> > [2] https://reviews.llvm.org/D54617
>> > [3] https://reviews.llvm.org/D54277
>> > [4] https://reviews.llvm.org/D55582
>> > [5] https://reviews.llvm.org/D56322
>> >
>> > _______________________________________________
>> > lldb-dev mailing list
>> > lldb-dev at lists.llvm.org
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>> >
>>
>> [Adding Tamas for his experience with recording and replaying APIs.]
>>
>>
>> Thank you for sharing the prototype Jonas. It looks very interesting,
>> but there are a couple of things that worry me about it.
>>
>> The first one is the usage of __PRETTY_FUNCTION__. That sounds like a
>> non-starter even for an initial implementation, as the string that
>> expands to is going to differ between compilers (gcc and clang will
>> probably agree on it, but I know for a fact it will be different on
>> msvc). It that was just an internal property of the serialization
>> format, then it might be fine, but it looks like you are hardcoding the
>> values in code to connect the methods with their replayers, which is
>> going to be a problem.
>>
>> I've been thinking about how could this be done better, and the best
>> (though not ideal) way I came up with is using the functions address as
>> the key. That's guaranteed to be unique everywhere. Of course, you
>> cannot serialize that to a file, but since you already have a central
>> place where you list all intercepted functions (to register their
>> replayers), that place can be also used to assign unique integer IDs to
>> these functions. So then the idea would be that the SB_RECORD macro
>> takes the address of the current function, that gets converted to an ID
>> in the lookup table, and the ID gets serialized.
>>
>> The part that bugs me about this is that taking the address of an
>> overloaded function is extremely tedious (you have to write something
>> like static_cast<function_prototype>(&SBFoo::Bar)). That would mean all
>> of these things would have to be passed to the RECORD macro. OTOH, the
>> upshot of this would be that the macro would now have sufficient
>> information to perform pretty decent error checking on its invocation.
>> Another nice about this could be that once you already have a prototype
>> and an address of the function, it should be possible (with sufficient
>> template-fu) to synthesize replay code for the function automatically,
>> at least in the simple cases, which would avoid the repetitiveness of
>> the current replay code. Together, this might obviate the need for any
>> clang plugins or other funny build steps.
>>
>>
>> The second thing I noticed is the usage of pointers for identifying
>> object. A pointer is good for that but only while the object it points
>> to is alive. Once the object is gone, the pointer can (and most likely
>> will) be reused. So, it sounds to me like you also need to track the
>> lifetime of these objects. That may be as simple as intercepting
>> constructor/destructor calls, but I haven't seen anything like that yet
>> (though I haven't looked at all details of the patch).
>>
>> Tying into that is the recording of return values. It looks like the
>> current RECORD_RETURN macro will record the address of the temporary
>> object in the frame of the current function. However, that address will
>> become invalid as soon as the function returns as the result object will
>> be copied into a location specified by the caller as a part of the
>> return processing. Are you handling this in any way?
>>
>> The final thing, which I noticed is the lack of any sign of threading
>> support. I'm not too worried about that, as that sounds like something
>> that could be fitted into the existing framework incrementally, but it
>> is something worth keeping in mind, as you're going to run into that
>> pretty soon.
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20190107/e8c5fd50/attachment-0001.html>