[lldb-dev] RFC: full support for python files, and avoid using FILE* internally

Tue Sep 24 10:03:54 PDT 2019

A bit of a tangent, but I've been getting requests to debug Python and C++ together. Things like TensorFlow start in Python, then call into C++ libraries. Users want to be able to debug the Python code as Python (not debugging into Python itself), then step into the C++ libraries. They want to go up and down the stack, switching languages as needed. "What was my Python code doing when it called into the library? Now what is the library doing?"

> -----Original Message-----
> From: lldb-dev <lldb-dev-bounces at lists.llvm.org> On Behalf Of Pavel Labath
> via lldb-dev
> Sent: Tuesday, September 24, 2019 6:11 AM
> To: Larry D'Anna <lawrence_danna at apple.com>
> Cc: lldb-dev at lists.llvm.org
> Subject: [EXT] Re: [lldb-dev] RFC: full support for python files, and avoid using
> FILE* internally
>
> On 23/09/2019 20:54, Larry D'Anna wrote:
> >
> >
> >> On Sep 23, 2019, at 7:11 AM, Pavel Labath <pavel at labath.sk> wrote:
> >>
> >> On 20/09/2019 17:35, Larry D'Anna via lldb-dev wrote:
> >>> Hi lldb-dev.
> >>> I want to be able to use LLDB inside of iPython, so I can have a mixed
> >>> Python and LLDB debug session.
> >>> To this end, I’d like to update LLDB to have full support for Python
> >>> file objects, so the outputs of debugger commands can be redirected
> >>> into iPython’s own streams.
> >>> This, however, is difficult to do, because LLDB makes use of FILE*
> >>> streams in a number of places. This presents two problems. The first
> >>> is that there is no really correct way to create SWIG typemaps that
> >>> handle conversion to FILE* and get the ownership semantics correct.
> >>> The second problem is that there is not a portable way to make a FILE*
> >>> with arbitrary callbacks for reading and writing. On Darwin and BSD
> >>> there’s funopen, and on Linux there’s something else, and I don’t know
> >>> if there’s any way on Windows.
> >>> I made an attempt at this a while ago using funopen, here:
> >>> https://reviews.llvm.org/D38829
> >>> Zachary Turner suggested a more thorough approach, where instead of
> >>> trying to use funopen to paper over all the use of FILE* streams, we
> >>> should make lldb_private::File capable of doing the dynamic dispatch
> >>> and excise all the unnecessary FILE* stuff in favor of
> >>> lldb_private::File.
> >>> That’s what I’ve done here:
> >>> https://github.com/smoofra/llvm-project/tree/files
> >>> I’ve posted the first few patches to Phabricator for review.
> >>> https://reviews.llvm.org/D67793
> >>> https://reviews.llvm.org/D67792
> >>> https://reviews.llvm.org/D67789
> >>> What do you think?
> >>
> >>
> >>
> >> Hello Larry,
> >>
> >> thanks for starting this thread.
> >>
> >> So, judging by your problem description, it sounds to me like you're
> >> primarily interested in the SBCommandInterpreter::HandleCommand family
> >> of functions (and by extension, the SBCommandReturnObject class). Would
> >> that be a fair thing to say?
> >
> > Not really. I want to be able to embed a full LLDB session inside of
> > iPython, which means redirecting anything that prints to the debugger's
> > main output and error streams. Yes, in most cases that will be coming
> > from HandleCommand(), but I really want to avoid the situation where some
> > output that would normally be printed to the terminal is missed under
> > iPython.
>
> Ok, that's fair.
>
> >
> >> The reason I am asking this is that I'm wondering what the scope of the
> >> thing you're proposing to do is (and then, whether this is the best way
> >> to accomplish that). For instance, if we were only interested in the
> >> HandleCommand API, then it might be possible to plug the Python in at a
> >> higher level (Stream instead of File). I am hoping that doing that might
> >> be easier, as the Stream class has a simpler interface and already
> >> supports multiple backing implementations (StreamFile, StreamString, ...).
> >>
> >> Also, doing that would allow us to sidestep some complicated questions.
> >> One of the reasons why getting rid of FILE* is so complicated (you're not
> >> the first person to try that) is that there are some APIs (libedit
> >> mainly) that we just cannot change, and which require a FILE*.
> >
> > I saw that. My strategy for dealing with that was to audit the codebase
> > for any use of File::GetStream(). I found that the only two places where
> > I could not remove the use of GetStream() were libedit and
> > IOHandlerCursesGUI. In my prototype, I deal with that by checking for NULL
> > from GetStream() before libedit or IOHandlerCursesGUI are enabled. In
> > other words, if a File can produce a FILE*, it will. But you can still
> > have a valid File that will return NULL from GetStream(). If you set your
> > debugger streams to Files that return NULL from GetStream(), then libedit
> > and the curses GUI will be disabled. I think this is a reasonable
> > approach. For my use case in particular, there is no need for either
> > libedit or the curses GUI, because the whole point is to use iPython as
> > the GUI. In general, libedit and curses only really make sense if the IO
> > streams are a terminal anyway, so it’s not a problem to disable these
> > features if the IO streams are redirected to Python.
>
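
The "check before you use it" idea above has a close analogue on the Python side, which may make it more concrete. This is a minimal sketch in plain standard-library Python, not LLDB code: can_use_terminal_ui is a hypothetical helper name, and fileno() only stands in for the role that File::GetStream() plays in the prototype.

```python
import io


def can_use_terminal_ui(stream):
    """Return True only if `stream` is backed by a real descriptor on a TTY.

    Hypothetical helper: it mirrors the idea of checking GetStream() for
    NULL before enabling libedit or the curses GUI.
    """
    try:
        stream.fileno()  # in-memory streams raise io.UnsupportedOperation
    except (AttributeError, OSError, ValueError):
        return False
    return stream.isatty()


# An in-memory stream is a valid file object but has no descriptor,
# so terminal-only features should stay off.
print(can_use_terminal_ui(io.StringIO()))
```

io.UnsupportedOperation derives from both OSError and ValueError, so the except clause covers it without naming it explicitly.
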
> Ok, that also sounds like a reasonable position to take. Might be the only
> reasonable position, even. Theoretically, one might try to go the extra mile
> and try to synthesize a FILE* using fopencookie et al. on platforms that
> support that (the only platforms that support libedit and curses also happen to
> have a fopencookie equivalent). That's probably overkill now, but it is nice to
> have that option open for the future.
>
> >
> >> If you do want to go with the more general change, then I'd like to ask
> >> you to give a bit more detail about your vision of the new role of the
> >> lldb_private::File class and its interaction with other major lldb
> >> components (SBFile, StreamFile, ???). My understanding (it's been a while
> >> since I looked at this in detail) is that the File class can be
> >> constructed from both a FILE* and a file descriptor and (crucially) it is
> >> also able to give back these underlying objects, including converting
> >> between the two. Now, I am assuming you're intending to add a third
> >> method of constructing a File object (using some Python callbacks), but I
> >> assume that (due to the mentioned lack of funopen etc.) you won't be
> >> trying to convert between these types. So, it would be good to spell out
> >> what exactly the File class promises to do, and what happens when (e.g.)
> >> a pythonified File object makes its way to code (libedit) which requires
> >> a FILE*.
> >
> > OK. My vision for File is that its main promise is to implement
> > File::Read and/or File::Write. Files can be constructed from descriptors
> > or FILE* streams, and in that case they should be able to give those
> > underlying objects back. But Files may also be constructed in other ways.
> > Clients should avoid calling GetDescriptor() or GetStream() if they can
> > help it. If they can’t help it, such as in the case of libedit or
> > IOHandlerCursesGUI, then they should check that they got a valid
> > descriptor or stream before proceeding.
> >
> > Files may also implement seek and tell, or not. If not, they should return
> > an “operation not supported” error from Seek() and Tell() and from the
> > versions of Read() and Write() that take offsets.
> >
>
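
Python's own io module follows much the same contract as the one described above, which may be a useful reference point: a stream implements only the operations it supports, and the rest raise io.UnsupportedOperation. A minimal sketch, assuming a hypothetical CallbackWriter class (it is not part of LLDB or of the patches under review):

```python
import io


class CallbackWriter(io.TextIOBase):
    """Write-only text stream that forwards each write to a callable.

    Hypothetical illustration; seek(), tell(), read() and fileno() are
    left at the io.TextIOBase defaults, which raise UnsupportedOperation.
    """

    def __init__(self, sink):
        super().__init__()
        self._sink = sink

    def writable(self):
        return True

    def write(self, text):
        self._sink(text)
        return len(text)


captured = []
out = CallbackWriter(captured.append)
print("frame #0: main", file=out)  # print() only needs write()

try:
    out.seek(0)
except io.UnsupportedOperation:
    captured.append("<seek not supported>\n")
```

A caller that only writes never notices the missing operations, while one that needs seeking gets a clear error instead of silent misbehaviour.
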
> Ok, this all sounds perfectly reasonable, but thanks for spelling that out. Now
> we have this description ready to attach as a comment to one of the
> patches. :)
>
> I think the only remaining thing that bothers me about all of this is the
> proliferation of shared pointers. Right now, each StreamFile object holds a
> lldb_private::File instance as a member (so it is uniquely owned). Your
> patches change this to shared_ptr<File>, which means that now we can have
> multiple StreamFiles sharing ownership of a single File object. Since Stream
> objects are already passed around as shared pointers, this seems like it gives us
> more flexibility (== opportunity to mess things up) than we really need. I kind
> of get why that might be necessary, and I can imagine that the only reason we
> did not need that so far is because the File class allows you to "cheat" and
> create multiple File instances pointing to a single FILE* (as long as at most one
> of them owns that FILE*).
>
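
For comparison, the "several wrappers, at most one owner" arrangement described above is also what Python exposes through the closefd flag of open(). A minimal sketch using only the standard library; the variable names are illustrative, and it says nothing about how the patches themselves handle ownership:

```python
import os

read_fd, write_fd = os.pipe()

owner = open(write_fd, "w")                    # closes write_fd on close()
borrower = open(write_fd, "w", closefd=False)  # never closes write_fd

borrower.write("a")
borrower.close()  # flushes "a"; the descriptor stays open

owner.write("b")  # still valid, because the owner has not closed it
owner.close()     # flushes "b" and closes write_fd

data = os.read(read_fd, 16)
os.close(read_fd)
print(data)
```

The scheme only works as long as exactly one wrapper is responsible for closing the descriptor, and that is the invariant shared ownership would otherwise have to enforce.
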
> However, I still can't escape the feeling that there should be some way to
> avoid that. Since you're now probably the one most familiar with these classes,
> what do you think about all of this?
>
> regards,
> pl