[lldb-dev] Rust language support question
Jim Ingham via lldb-dev
lldb-dev at lists.llvm.org
Fri Jan 26 18:16:07 PST 2018
I second Greg, we don't really want to make the lldb_private namespace into API. We don't have any intention to stabilize this, and the interfaces expose a lot of clang & llvm types which are also not stable and change not infrequently. So in effect your plugin would end up being rev-locked to the version of lldb it was built with anyway. I'm not sure you'd gain all that much by trying to make an external plugin. The plugin structure on disk is fairly well separated language by language, so it wouldn't be hard to overlay your support and then build in-tree even if your code isn't in the llvm.org repository. OTOH I would be happy to see Rust support in the main-line lldb, so I'm not encouraging this approach.
This was implicit in what Greg said, but unlike gdb, lldb doesn't have a language-neutral internal representation for types. Rather, each language can provide its own internal representation for types. The current Rust support in lldb amounts to telling lldb that "Rust types can be handled by clang's type representation." I naively thought this would make support for Rust weak, but folks on Stack Overflow say it actually works pretty well for viewing variables (using "frame var" or lldb's ValueObjects). Stepping and so forth apparently seemed to be working okay as well. Depending on how far off this actually is, you might be able to reuse the Clang TypeSystem and do mutatis mutandis for the differences? That would certainly be a lot simpler than inventing another type representation.
Note, I think the jury's still out on whether it was a great idea to have the swift type representation be the swift compiler's internal state. It has proven more than a little fragile. I'm not sure I would suggest that route. I vaguely remember back in the day there was a -g flag in gcc that produced a compiler state dump that gdb was supposed to read. But IIRC that ended up being more trouble than it was worth.
Anyway, to the point of using Rust code in the lldb codebase. I don't think that we want to have lldb's build depend on having a Rust compiler present. That seems like a fairly heavy-weight requirement for all the folks who don't work on Rust. But at least I wouldn't mind having some code that was conditionalized on the presence of the Rust compiler, and when it was absent the Rust TypeSystem would just return a no-op expression parser. I don't know how other people feel about that, however. Of course, this would only be worth the effort if your Rust compiler front-end was planning to call back to lldb, say for name & type resolution. If it is stand-alone then it might as well be some external library you call out to. If all the parser library needed to do was call back to query for name resolution as it went along, though, you very well could do that using the SB APIs. Those we do support writing external libraries against; our strong policy is to maintain binary compatibility through the SB APIs.
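To make the "conditionalized" idea concrete, here is a minimal sketch, not actual lldb code. Every name in it (ExpressionParserSketch, NoOpExpressionParser, GetRustExpressionParser, and the SKETCH_HAVE_RUST_PARSER macro) is hypothetical and stands in for whatever the real TypeSystem and build system would use. It only shows the shape: when the Rust-backed parser is not compiled in, callers still get a parser object, and it simply reports that it cannot evaluate anything.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical result type for this sketch only.
struct EvalResult {
  bool ok;
  std::string message;
};

// Hypothetical minimal parser interface; lldb's real interfaces are richer.
class ExpressionParserSketch {
public:
  virtual ~ExpressionParserSketch() = default;
  virtual EvalResult Evaluate(const std::string &expr) = 0;
};

// Fallback used when lldb was built without a Rust compiler available.
class NoOpExpressionParser : public ExpressionParserSketch {
public:
  EvalResult Evaluate(const std::string &) override {
    return {false, "Rust expression support was not built into this lldb"};
  }
};

#ifdef SKETCH_HAVE_RUST_PARSER
// Assumed to be provided by the Rust-backed code when it is compiled in.
std::unique_ptr<ExpressionParserSketch> MakeRustParser();
#endif

// What a Rust TypeSystem might hand out when asked for an expression parser.
std::unique_ptr<ExpressionParserSketch> GetRustExpressionParser() {
#ifdef SKETCH_HAVE_RUST_PARSER
  std::unique_ptr<ExpressionParserSketch> parser = MakeRustParser();
#else
  std::unique_ptr<ExpressionParserSketch> parser =
      std::make_unique<NoOpExpressionParser>();
#endif
  assert(parser != nullptr);
  return parser;
}
```

With the macro undefined, GetRustExpressionParser()->Evaluate("1 + 2") returns a result whose ok field is false, so the rest of the debugger never has to special-case a missing parser.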
Another little quirk of lldb that might affect your planning is that unlike gdb, we don't have full calling-convention emulation in lldb at present. So we don't have the ability to marshal the arguments for any old randomly complicated function. That means the gdb strategy of parsing the expression till you get to a point where you want to call a function, then just hand calling it off to one side, isn't currently possible. The Go Expression parser just fails if there's any expression that requires a function call.
The way we get around this in lldb is that we marshal the inputs to the expression as well as the return type into an lldb-defined struct we cons up for the expression. Then we write a wrapper function that contains the expression but uses the input values from the argument struct. That way we only need to call a function that gets passed in a pointer to this argument struct, which tends to be pretty easy to do. Then we JIT and insert that function, create the input argument structure and call our wrapper function. That has some nice side-benefits like for breakpoint condition expressions we only need to parse & insert the expression once and then we can pretty cheaply call it on each breakpoint hit. In actuality it's a little more complicated than this, but anyway it means a lot of work gets offloaded to the ExpressionParser, but we avoid having to encode the calling convention in detail in lldb.
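As a rough illustration of the argument-struct trick, here is a hand-written version of what the generated wrapper might look like for the expression "a * b + 1". This is a sketch under assumptions: the names ExprArgs, expr_wrapper and evaluate_once are made up, and in lldb the struct layout and wrapper are produced by the expression parser and JIT-compiled into the target rather than written by hand.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical argument struct for the expression `a * b + 1`.
// The debugger would lay this out, write it into the target's memory,
// and pass its address as the wrapper's only argument.
struct ExprArgs {
  int32_t a;       // input: value of variable `a` from the current frame
  int32_t b;       // input: value of variable `b` from the current frame
  int32_t result;  // output: slot the wrapper fills with the result
};

// The wrapper takes exactly one pointer, so the caller only needs to know
// how to pass a single pointer argument, not a full calling convention.
extern "C" void expr_wrapper(void *raw_args) {
  assert(raw_args != nullptr);
  auto *args = static_cast<ExprArgs *>(raw_args);
  args->result = args->a * args->b + 1;
}

// Stand-in for the debugger side: fill the struct, call the wrapper,
// read the result back out.
int32_t evaluate_once(int32_t a, int32_t b) {
  ExprArgs args{a, b, 0};
  expr_wrapper(&args);
  return args.result;
}
```

The breakpoint-condition benefit falls out of this shape: the wrapper is built once, and each hit only has to refill the struct and call it again.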
Since we already have a compiler sitting around, that seemed a good "don't redo work" trade-off. As a side note, it was also necessitated by the fact that though llvm knows about the calling convention, it didn't for a long time know how to express that in terms of clang types. And there wasn't an easy way to cross the clang type -> llvm type barrier. IIUC, that barrier is no longer present, but there still aren't useful APIs to express "for an argument list of these Clang types, where would the arguments go?" So if it were important to get this ability for your work, I think the remaining bit is to convert the results of computing the argument locations from clang into some form useful to lldb (DWARF expressions would be a good choice). Of course if Rust doesn't use the C calling conventions on the platform you'll have to adjust for that as well.
This is just a random collection of thoughts, but it's Friday afternoon so that seemed appropriate. Hope they are of some help...
> On Jan 26, 2018, at 3:49 PM, Greg Clayton via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>> On Jan 26, 2018, at 2:54 PM, Tom Tromey via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>> Hi. I'm working on adding Rust language support to lldb.
>> One question that's come up is the best way to handle expression
>> On the one hand, it would be convenient to reuse an existing parser --
>> the one that we discussed was the "syn" crate. But, this is a Rust
> We have great support for multiple languages in LLDB. First off you will need to subclass lldb_private::TypeSystem. See lldb_private::ClangASTContext for the most complete version. Your type system allows you to use an abstract syntax tree that makes sense for your language. For clang, we actually create a clang::ASTContext using the code from the compiler and then we translate debug info from DWARF into clang types. So you would probably want to make a "lldb_private::RustASTContext". Then, if you use DWARF as your debug info, you will want to subclass DWARFASTParser with something like DWARFASTParserRust. This is where you will translate types from DWARF back into your custom AST types. TypeSystem then becomes your main type abstraction within LLDB. There are many virtual functions in it that you must override and some that you might want to override. Types in LLDB are handed out as "lldb_private::CompilerType". This class contains a "lldb_private::TypeSystem *" plus a "void *" which can point to whatever makes sense in your lldb_private::RustASTContext class. Similarly CompilerDecl and CompilerDeclContext also have a "lldb_private::TypeSystem *" plus a "void *" so you can hand out declarations and namespaces, etc.
> When it comes to expressions, we will dig up the right lldb_private::TypeSystem for a given stack frame language and then we will call lldb_private::TypeSystem::GetUserExpression(...) and your type system can evaluate your expression exactly as your language should.
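The "TypeSystem * plus void *" arrangement described above can be sketched in a few lines. This is a toy model, not the real lldb_private headers: the class names carry a "Sketch" suffix to mark them as hypothetical, and the two virtual functions shown are stand-ins for the much larger set the real TypeSystem declares.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

class TypeSystemSketch;

// Toy analogue of a type handle: a pointer to the owning type system plus
// an opaque pointer that only that type system knows how to interpret.
struct CompilerTypeSketch {
  TypeSystemSketch *type_system = nullptr;
  void *opaque = nullptr;
  std::string GetName() const;
  uint64_t GetByteSize() const;
};

// Toy analogue of the type system base class.
class TypeSystemSketch {
public:
  virtual ~TypeSystemSketch() = default;
  virtual std::string GetTypeName(void *opaque) = 0;
  virtual uint64_t GetByteSize(void *opaque) = 0;
};

// The handle forwards every query to its type system.
inline std::string CompilerTypeSketch::GetName() const {
  return type_system ? type_system->GetTypeName(opaque) : "<invalid>";
}
inline uint64_t CompilerTypeSketch::GetByteSize() const {
  return type_system ? type_system->GetByteSize(opaque) : 0;
}

// Hypothetical Rust type system: it owns its own type nodes and hands out
// handles whose opaque pointer refers to one of those nodes.
class RustASTContextSketch : public TypeSystemSketch {
  struct RustType {
    std::string name;
    uint64_t byte_size;
  };

public:
  CompilerTypeSketch CreateIntegerType(const std::string &name,
                                       uint64_t byte_size) {
    m_types.push_back(std::make_unique<RustType>(RustType{name, byte_size}));
    return CompilerTypeSketch{this, m_types.back().get()};
  }
  std::string GetTypeName(void *opaque) override {
    assert(opaque != nullptr);
    return static_cast<RustType *>(opaque)->name;
  }
  uint64_t GetByteSize(void *opaque) override {
    assert(opaque != nullptr);
    return static_cast<RustType *>(opaque)->byte_size;
  }

private:
  std::vector<std::unique_ptr<RustType>> m_types;
};
```

The point of the design is that generic debugger code only ever sees the handle; everything language-specific lives behind the virtual calls, which is also where a per-language expression evaluator would be looked up.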
>> So then there's the question of how to ship it. Directly using the syn
>> crate would mean having Rust code in-tree. Or, perhaps the Rust parts
>> could be shipped as a shared library or an external executable.
> You can make your lldb_private::RustASTContext have a custom setting in LLDB that specifies a path to a shared library to load. You could keep all of the functionality out of LLDB that way, just have lldb_private::RustASTContext shim over to the shared library to do the work. Or you can build it right into LLDB. It really depends on how sharable your compiler code is and if the code can be used in another program.
>> Are either of these doable? What do other language plugins do?
> It is. All other languages compile in their support right now, but that is mostly due to the lack of a stable API in clang. Clang has no library interface that exports all of the details that we need, so we just build it into LLDB. Swift does the same thing: a full compiler is included in LLDB itself. Go and Java and OCaml all write their own minimal debug layer that doesn't depend on the compiler codebase at all.
> It takes some work to build your compiler so that it can share its implementation with the debugger, but the investment can pay off:
> - expressions in clang can use latest language features just by updating code
> - no need to maintain a separate expression parser that constantly gets out of date with the current compiler
> - no need to invent a type system, just use your native AST. This also helps ensure you can recreate any types from DWARF since if some info is missing in DWARF, you won't be able to convert it back into your AST format without losing something
> Swift did a slightly different thing that you might want to think about: if your compiler can serialize an AST when it builds your program, you can put that serialized AST into the executable, along with debug info. When you debug, you can deserialize the AST and hand it right back to the compiler! With Swift that was great because the compiler guys would change the language and add new features. As they did this, we wouldn't always have work to do in LLDB, because they would store a blob of info in the binary, and we would hand it back to them. The debug info didn't actually have types in it, the DWARF just had mangled names for the types. We take those mangled names, and hand them to the compiler code, and it would use the mangled names to locate the type and hand it back to us. So we didn't need to know anything about the type, just hand a string to the compiler and it would hand us back a type. Swift also has generics, like a "Dictionary<String, String>" and even these types aren't each contained in the AST, but they can be re-created on the fly by the compiler code. This all was hidden behind the "there is a mangled type name, please give me a type back".
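The "hand the compiler a mangled name, get a type back" interface can be sketched as a single callback. Everything here is an assumption made for illustration: TypeInfoSketch, CompilerLookup and MangledTypeResolver are invented names, the cache is a design choice of this sketch rather than something the message describes, and real mangled names look nothing like the placeholder strings used in the usage note.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical description of a resolved type.
struct TypeInfoSketch {
  std::string display_name;
  unsigned byte_size;
};

// Stand-in for the compiler code: given a mangled name, find or rebuild the
// type (generic instantiations included), or report that it is unknown.
using CompilerLookup =
    std::function<std::optional<TypeInfoSketch>(const std::string &)>;

// Debugger-side resolver: it knows nothing about the type itself, it only
// forwards the string and remembers the answer.
class MangledTypeResolver {
public:
  explicit MangledTypeResolver(CompilerLookup lookup)
      : m_lookup(std::move(lookup)) {
    assert(m_lookup != nullptr);
  }

  std::optional<TypeInfoSketch> Resolve(const std::string &mangled) {
    auto it = m_cache.find(mangled);
    if (it != m_cache.end())
      return it->second;  // already asked the compiler about this name
    std::optional<TypeInfoSketch> type = m_lookup(mangled);
    if (type)
      m_cache.emplace(mangled, *type);  // only successful lookups are cached
    return type;
  }

  std::size_t CachedCount() const { return m_cache.size(); }

private:
  CompilerLookup m_lookup;
  std::unordered_map<std::string, TypeInfoSketch> m_cache;
};
```

A caller would construct the resolver with a lambda that wraps the compiler library, then call Resolve("some_mangled_name") for each name found in the DWARF; the debugger never needs to understand the type's structure to do so.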
>> My original plan here was to simply make the entire language support an
>> external plugin. But, from what I can tell this isn't possible -- the
>> necessary DWARF-related headers aren't installed. So maybe this could
>> be changed? This would provide us with the most flexibility I think.
> I wouldn't recommend that. I would recommend making a stable API to your custom shared library that lldb_private::RustASTContext knows how to load. Many of the virtual functions in TypeSystem already will show you the layers we need. The LLDB internal API is not stable and shouldn't be used or exported through to other plug-ins. It is possible, but the sheer number of HUGE C++ names would make any library export produce very large LLDB binaries. Right now we really take care to only export a sensible API. So if you need to be out of process, design an API into Rust that produces a shared library that LLDB can use and that you commit to. As you update Rust, you can then just update the shared library and then re-run LLDB. Does that sound feasible? If we need to develop an API for pluggable type systems in LLDB, maybe we can use your rust shared library as a basis for others in the future.
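One way to picture a stable API of this kind is a small, versioned table of C function pointers that the shared library exports and the LLDB-side shim validates before use. The sketch below is an assumption from start to finish: RustSupportApiV1, RustSupportShim and toy_type_byte_size are invented names, no such interface exists in LLDB or Rust today, and the actual library loading step (for example via dlopen and dlsym on POSIX systems) is left out so the example stays self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

extern "C" {
// Hypothetical stable C ABI that the Rust-side shared library would commit
// to. Plain C types only, so no C++ names cross the library boundary.
struct RustSupportApiV1 {
  uint32_t abi_version;
  // Returns 0 on success and writes the size in bytes to *out_size.
  int (*type_byte_size)(const char *type_name, uint64_t *out_size);
};
}

constexpr uint32_t kExpectedAbiVersion = 1;

// LLDB-side shim: accepts a table only if its version matches, and turns
// C-style status codes into something the rest of the debugger can use.
class RustSupportShim {
public:
  explicit RustSupportShim(const RustSupportApiV1 *api)
      : m_api(Accept(api) ? api : nullptr) {}

  bool IsLoaded() const { return m_api != nullptr; }

  bool GetTypeByteSize(const char *name, uint64_t &size) const {
    if (!m_api)
      return false;
    return m_api->type_byte_size(name, &size) == 0;
  }

private:
  static bool Accept(const RustSupportApiV1 *api) {
    return api != nullptr && api->abi_version == kExpectedAbiVersion &&
           api->type_byte_size != nullptr;
  }
  const RustSupportApiV1 *m_api;
};

// Stand-in for the library side, so the sketch runs without a real library.
extern "C" int toy_type_byte_size(const char *type_name, uint64_t *out_size) {
  assert(type_name != nullptr && out_size != nullptr);
  if (std::strcmp(type_name, "u32") == 0) { *out_size = 4; return 0; }
  if (std::strcmp(type_name, "u64") == 0) { *out_size = 8; return 0; }
  return 1;  // unknown type
}

const RustSupportApiV1 kToyApi = {1, &toy_type_byte_size};
```

Because the shim rejects a table with an unexpected abi_version, the library can be updated independently of LLDB, and a mismatch degrades to "Rust support not loaded" rather than a crash.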
>> A final idea is to do what I did for gdb, and simply write a new parser
>> in C++. Doable, but for me I think a last resort.
> I agree that this is a last resort kind of approach. Lots of work keeping the compiler and expression parser in sync. Better to architect the compiler correctly so that the code can be re-used.
> Let me know if you have any questions about anything I said above,
> Greg Clayton
>> lldb-dev mailing list
>> lldb-dev at lists.llvm.org