[lldb-dev] RFC: full support for python files, and avoid using FILE* internally
Pavel Labath via lldb-dev
lldb-dev at lists.llvm.org
Tue Sep 24 04:11:10 PDT 2019
On 23/09/2019 20:54, Larry D'Anna wrote:
>> On Sep 23, 2019, at 7:11 AM, Pavel Labath <pavel at labath.sk> wrote:
>> On 20/09/2019 17:35, Larry D'Anna via lldb-dev wrote:
>>> Hi lldb-dev.
>>> I want to be able to use LLDB inside of iPython, so I can have a mixed Python and LLDB debug session.
>>> To this end, I’d like to update LLDB to have full support for python file objects, so the outputs of debugger commands can be redirected into iPython’s own streams.
>>> This, however, is difficult to do, because LLDB makes use of FILE* streams in a number of places. This presents two problems. The first is that there is no really
>>> correct way to create SWIG typemaps that handle conversion to FILE* and get the ownership semantics correct. The second problem is that there is not a portable
>>> way to make a FILE* with arbitrary callbacks for reading and writing. On Darwin and BSD there’s funopen, and on Linux there’s something else, and I don’t know if
>>> there’s any way on Windows.
>>> I made an attempt at this a while ago using funopen, here:
>>> Zachary Turner suggested a more thorough approach, where instead of trying to use funopen to paper over all the use of FILE* streams, we should make
>>> lldb_private::File capable of doing the dynamic dispatch and excise all the unnecessary FILE* stuff in favor of lldb_private::File.
>>> That’s what I’ve done here: https://github.com/smoofra/llvm-project/tree/files
>>> I’ve posted the first few patches to phabricator for review.
>>> What do you think?
>> Hello Larry,
>> thanks for starting this thread.
>> So, judging by your problem description, it sounds to me like you're primarily interested in the SBCommandInterpreter::HandleCommand family of functions (and by extension, the SBCommandReturnObject class). Would that be a fair thing to say?
> Not really. I want to be able to embed a full LLDB session inside of iPython, which means redirecting anything that prints to the debugger's main output and error streams. Yes, in most cases that will be coming from HandleCommand(), but I really want to avoid the situation where some output that would normally be printed to the terminal is missed under iPython.
Ok, that's fair.
>> The reason I am asking this is that I'm wondering what is the scope of the thing you're proposing to do (and then, whether this is the best way to accomplish that). For instance, if we were only interested in the HandleCommand api, then it might be possible to plug the python in at a higher level (Stream instead of File). I am hoping that doing that might be easier as the Stream class has a simpler interface, and already supports multiple backing implementations (StreamFile, StreamString, ...).
>> Also, doing that would allow us to sidestep some complicated questions. One of the reasons why getting rid of FILE* is so complicated (you're not the first person to try that) is that there are some APIs (libedit mainly) that we just cannot change, and which require a FILE*.
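A rough illustration of the Stream-level alternative mentioned above, written in Python for brevity. The class and function names (Stream, StreamString, StreamCallback, handle_command) are hypothetical stand-ins and not LLDB's real API; the sketch only shows why a one-method interface is easy to back with an arbitrary callable.

```python
class Stream:
    """Hypothetical minimal stream interface: subclasses implement write()."""

    def write(self, text: str) -> int:
        raise NotImplementedError

    def printf(self, fmt: str, *args) -> int:
        # Formatting helper built purely on top of write().
        return self.write(fmt % args)


class StreamString(Stream):
    """Collects everything written into an in-memory string."""

    def __init__(self):
        self._parts = []

    def write(self, text):
        self._parts.append(text)
        return len(text)

    def get_string(self) -> str:
        return "".join(self._parts)


class StreamCallback(Stream):
    """Forwards everything written to an arbitrary Python callable."""

    def __init__(self, callback):
        self._callback = callback

    def write(self, text):
        self._callback(text)
        return len(text)


def handle_command(command: str, output: Stream) -> None:
    # Placeholder for a command handler; it only knows about Stream.
    output.printf("ran: %s\n", command)
```

Under these assumptions, a caller could pass StreamCallback(some_callable) to handle_command and receive the output without any FILE* being involved. It would not capture output that bypasses the Stream passed in, which is the limitation raised in the reply below.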
> I saw that. My strategy for dealing with that was to audit the codebase for any use of File::GetStream(). I found that the only two places where I could not remove the use of GetStream() were libedit and IOHandlerCursesGUI. In my prototype, I deal with that by checking for NULL from GetStream() before libedit or IOHandlerCursesGUI are enabled. In other words, if a File can produce a FILE*, it will. But you can still have a valid File that will return NULL from GetStream(). If you set your debugger streams to Files that return NULL from GetStream(), then libedit and the curses GUI will be disabled. I think this is a reasonable approach. For my use-case in particular, there is no need for either libedit or the curses GUI, because the whole point is to use iPython as the GUI. In general, libedit and curses only really make sense if the IO streams are a terminal anyway, so it’s not a problem to disable these features if the IO streams are redirected to Python.
Ok, that also sounds like a reasonable position to take. Might be the
only reasonable position, even. Theoretically, one might try to go the
extra mile and try to synthesize a FILE* using fopencookie et al. on
platforms that support that (the only platforms that support libedit and
curses also happen to have a fopencookie equivalent). That's probably
overkill now, but it is nice to have that option open for the future.
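
The "check for NULL before enabling libedit" strategy discussed above can be sketched roughly as follows. This is a Python illustration only: File, NativeFile, CallbackFile and can_use_line_editor are hypothetical names, and get_stream() returning None stands in for GetStream() returning NULL.

```python
import io


class File:
    """Hypothetical stand-in for lldb_private::File; not LLDB's real API."""

    def write(self, data: str) -> int:
        raise NotImplementedError

    def get_stream(self):
        """Return the underlying native stream, or None if there is none."""
        return None


class NativeFile(File):
    """Backed by a real stream, so it can hand that stream back."""

    def __init__(self, stream):
        self._stream = stream

    def write(self, data):
        return self._stream.write(data)

    def get_stream(self):
        return self._stream


class CallbackFile(File):
    """Backed by an arbitrary callable; it has no native stream to give back."""

    def __init__(self, write_callback):
        self._write_callback = write_callback

    def write(self, data):
        self._write_callback(data)
        return len(data)


def can_use_line_editor(output: File) -> bool:
    # A libedit-like component needs a native stream, so check before enabling it.
    return output.get_stream() is not None
```

With this shape, a CallbackFile is still a perfectly valid output target; it simply causes the terminal-only features to be switched off rather than failing later.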
>> If you do want to go with the more general change, then I'd like to ask you to give a bit more detail about your vision of the new role of the lldb_private::File class and its interaction with other major lldb components (SBFile, StreamFile, ???). My understanding (it's been a while since I looked at this in detail) is that the File class can be constructed from both FILE* and a file descriptor and (crucially) it is also able to give back these underlying objects, including converting between the two. Now, I am assuming you're intending to add a third method of constructing a File object (using some python callbacks), but I assume that (due to the mentioned lack of funopen etc.) you won't be trying to convert between these types. So, it would be good to spell out what exactly the File class promises to do, and what happens when (e.g.) a pythonified File object makes its way to code (libedit) which requires a FILE*.
> OK. My vision for File is that its main promise is to implement File::Read and/or File::Write. Files can be constructed from descriptors, or FILE* streams, and in that case they should be able to give those underlying objects back. But files may also be constructed in other ways. Clients should avoid calling GetDescriptor() or GetStream() if they can help it. If they can’t help it, such as in the case of libedit or IOHandlerCursesGUI, then they should check that they got a valid descriptor or stream before proceeding.
> Files may also implement seek and tell, or not. If not they should return an “operation not supported” error from Seek() and Tell() and the versions of Read() and Write() that take offsets.
Ok, this all sounds perfectly reasonable, but thanks for spelling that
out. Now we have this description ready to attach as a comment in one
of the patches. :)
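
As a rough illustration of the contract spelled out above (read and/or write are the core promise; seek, tell and the offset-taking variants may report "operation not supported"), here is a minimal Python sketch. The class and method names are hypothetical, and io.UnsupportedOperation is used here as an assumed analogue of the error the real Seek() and Tell() would return.

```python
import io


class File:
    """Hypothetical base class: subclasses implement read and/or write."""

    def read(self, size: int) -> bytes:
        raise io.UnsupportedOperation("read")

    def write(self, data: bytes) -> int:
        raise io.UnsupportedOperation("write")

    def seek(self, offset: int) -> int:
        raise io.UnsupportedOperation("seek")

    def tell(self) -> int:
        raise io.UnsupportedOperation("tell")

    def write_at(self, data: bytes, offset: int) -> int:
        # Offset-taking variant: it only works if seek() is supported.
        self.seek(offset)
        return self.write(data)


class WriteOnlySink(File):
    """Write-only, non-seekable file that collects output in memory."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)
        return len(data)
```

A sink like this satisfies the contract by implementing write() alone; callers that need positioning get an explicit error instead of silently misplaced data.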
I think the only remaining thing that bothers me about all of this is
the proliferation of shared pointers. Right now, each StreamFile object
holds a lldb_private::File instance as a member (so it is uniquely
owned). Your patches change this to shared_ptr<File>, which means that
now we can have multiple StreamFiles sharing ownership of a single File
object. Since Stream objects are already passed around as shared
pointer, this seems like it gives us more flexibility (== opportunity to
mess things up) than we really need. I kind of get why that might be
necessary, and I can imagine that the only reason we did not need that
so far is because the File class allows you to "cheat" and create
multiple File instances pointing to a single FILE* (as long as at most
one of them owns that FILE*).
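
The "cheat" described above, several File instances pointing at one underlying stream with at most one of them owning it, might look roughly like this. It is a Python sketch with a hypothetical File wrapper and owns_stream flag, not the actual lldb_private::File implementation.

```python
import io


class File:
    """Hypothetical wrapper around a stream with an explicit ownership flag."""

    def __init__(self, stream, owns_stream: bool):
        self._stream = stream
        self._owns_stream = owns_stream

    def write(self, data: str) -> int:
        return self._stream.write(data)

    def close(self) -> None:
        # Only the owning wrapper closes the underlying stream.
        if self._owns_stream and self._stream is not None:
            self._stream.close()
        self._stream = None


underlying = io.StringIO()
owner = File(underlying, owns_stream=True)
borrower = File(underlying, owns_stream=False)

borrower.write("hello")
borrower.close()  # does not close the shared stream
owner.write(" world")
```

The scheme only works while the owner outlives every borrower; if the owner is closed first, the borrowers are left holding a closed stream. That lifetime coupling is presumably harder to arrange for a File backed by callbacks, where there is no separate underlying object to borrow.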
However, I still can't escape the feeling that there should be some way
to avoid that. Since you're now probably most familiar with these
classes, what do you think about all of this?