[Lldb-commits] [PATCH] D78801: [LLDB] Add class WasmProcess for WebAssembly debugging

Tue May 5 04:48:39 PDT 2020

labath added a comment.

In D78801#2019384 <https://reviews.llvm.org/D78801#2019384>, @clayborg wrote:

> Interesting approach to DWARF expression evaluation, though it might be simpler to leave the DWARFExpression as it is, and have the plug-in part only handle unknown opcodes. Right now if you want to add just one opcode, you must subclass and make a plug-in instance where 99% of opcodes get forwarded to the original implementation and the one new opcode gets handled by the plug-in. What if the DWARFExpression class was passed to a plug-in that only handles unknown opcodes? Might be fewer code changes and be a bit cleaner. The plug-in would handle only the DW_OP_WASM_location and would still be found/detected if and only if one is encountered?

I think that the main reason this is awkward is that the evaluator is plugged in at the level of a single dwarf operation. That requires passing around of a lot of state. If it was plugged in at the level of evaluating the entire expression, then the amount of state to pass is much smaller. Obviously, the evaluation itself would then need to be factored into multiple functions so that one can override just the evaluation of a single expression, but that's pretty standard software engineering work, and something that we probably should do for code health anyway.

In fact, I believe that if we do this right then the result could be much cleaner that the current situation. Going off of the idea of caching the evaluator in the module, what if we don't "cache" the evaluator itself, but actually a factory (function) for it. The advantage of that (the factory function creating a evaluator instance for a specific expression) is that we could store a lot of the evaluation state in the evaluator object, instead of passing it all around through function arguments (I find the long function argument lists to be one of the main problems of the current DWARFExpression class).

> As for unwinding, I still don't think we need the UnwindWASM class. See my inline comment in "Unwind &Thread::GetUnwinder()" for a bit of reasoning. It is very common for runtimes for languages to support supplying the stack frames for a thread, so this should be built into the lldb_private::Process/lldb_private::Thread classes. For stack unwindind I would suggest adding functions to lldb_private::Thread:
> 
>   class Thread {
>     /// Check if the runtime supports unwinding call stacks and return true if so.
>     ///
>     /// If the language runs in a runtime that knows how to unwind the call stack
>     /// for a thread, then this function should return true.
>     ///
>     /// If true is returned, unwinding will use a RuntimeUnwind class that will call
>     /// into this class' Thread::GetRuntimeFrame* functions to do the unwinding.
>     /// If false is returned, the standard UnwindLLDB will be used where unwinding
>     /// will be done using registers, unwind info and other debug info.
>     virtual bool HasRuntimeUnwindSupport() {
>       return false; // Default to using UnwindLLDB()
>     }
>     virtual uint32_t GetRuntimeFrameCount() {
>       return 0;
>    }
>    virtual bool GetRuntimeFrameInfoAtIndex(uint32_t frame_idx, lldb::addr_t &cfa, lldb::addr_t &pc, bool &behaves_like_zeroth_frame) {
>     return false;
>    }
>    virtual lldb::RegisterContextSP GetRuntimeFrameRegisterContext(uint32_t frame_idx) {
>     return lldb::RegisterContextSP();
>    } 
> 
> 
> Then either ThreadGDBRemote, or possibly a subclass like ThreadWasm will implement these virtual functions.

If we have `ThreadWasm` then I believe we don't need any of this as `ThreadWasm` can just override the appropriate function which returns the unwinder object.

> For the GetWasmLocal, GetWasmGlobal, GetWasmStackValue, I still think abstracting this into the lldb_private::Process/lldb_private::Thread is the right way to do this. The way you have this right now, there is not way to tell how big the buffer needs to be to fetch any of these local/global/stack values. They all assume 16 bytes is enough.

For me, the main one of the advantages of having a wasm-specific class is that it can make wasm-specific assumptions. That said, the apis in question are definitely very c-like and could definitely be brought into the c++14 world.

================
Comment at: lldb/source/Interpreter/CommandInterpreter.cpp:738-755
+  std::unique_ptr<CommandObjectRegexCommand> connect_wasm_cmd_up(
+      new CommandObjectRegexCommand(
+          *this, "wasm",
+          "Connect to a WebAssembly process via remote GDB server.  "
+          "If no host is specifed, localhost is assumed.",
+          "wasm [<hostname>:]<portnum>", 2, 0, false));
+  if (connect_wasm_cmd_up) {
----------------
One way to improve this would be to have lldb detect the kind of plugin that it is talking to and create an appropriate instance based on that. However, that would require more refactorings, so this is good for a start anyway.

================
Comment at: lldb/source/Plugins/Process/wasm/WasmProcess.cpp:185
+
+bool WasmProcess::GetWasmCallStack(std::vector<lldb::addr_t> &call_stack_pcs) {
+  call_stack_pcs.clear();
----------------
clayborg wrote:
> This API seems wrong to be on the process. It should be on the ThreadWasm class (if we end up with one). see my main comments for more details.
I guess that's because Paolo was avoiding (or not being able to) create ThreadWasm objects. If we make that possible (per my other comment) then this should be doable as well.

================
Comment at: lldb/source/Target/Thread.cpp:1857-1863
+  if (!m_unwinder_up) {
+    if (CalculateTarget()->GetArchitecture().GetMachine() ==
+        llvm::Triple::wasm32)
+      m_unwinder_up.reset(new wasm::UnwindWasm(*this));
+    else
+      m_unwinder_up.reset(new UnwindLLDB(*this));
+  }
----------------
clayborg wrote:
> If the call stack is available through the Process/Thread interface, I think we should be asking the thread for this information. So this code could be:
> 
> ```
> if (!m_unwinder_up) {
>     if (HasRuntimeUnwindSupport())
>       m_unwinder_up.reset(new RuntimeUnwind(*this));
>     else
>       m_unwinder_up.reset(new UnwindLLDB(*this));
>   }
> ```
> 
> RuntimeUnwind would be a class that will fetch the stack frame information through the new virtual functions on the Thread class, but only if the virtual Thread::HasRuntimeUnwindSupport() returns true. As new languages and runtimes are added in the future I don't want to see this function look like:
> 
> ```
> if (!m_unwinder_up) {
>     if (CalculateTarget()->GetArchitecture().GetMachine() ==
>         llvm::Triple::wasm32)
>       m_unwinder_up.reset(new wasm::UnwindWasm(*this));
>     else if (CalculateTarget()->GetArchitecture().GetMachine() ==
>         llvm::Triple::Rust)
>       m_unwinder_up.reset(new rust::UnwindRust(*this));
>     else if (CalculateTarget()->GetArchitecture().GetMachine() ==
>         llvm::Triple::Go)
>       m_unwinder_up.reset(new go::UnwindGo(*this));
>     else
>       m_unwinder_up.reset(new UnwindLLDB(*this));
>   }
> ```
> 
> So we should find a way to generalize the stack frames being fetched from the process/thread classes using virtual functions. 
> 
> I know this is the way GDB was built (many ifdefs and arch specific detecting code everywhere), but we have plug-ins in LLDB that are there to abstract us from this kind of code. 
The way I would imagine this happening is that ProcessWasm overrides the appropriate method (if there isn't one we can create it) so that it creates `ThreadWasm`s instead of plain `ThreadGdbRemote`s. Then `ThreadWasm` can override `GetUnwinder` to return an `UnwindWasm` instance.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78801/new/

https://reviews.llvm.org/D78801