[Lldb-commits] [PATCH] D78801: [LLDB] Add class ProcessWasm for WebAssembly debugging

Wed Apr 29 10:11:22 PDT 2020

paolosev added a comment.

In D78801#2009856 <https://reviews.llvm.org/D78801#2009856>, @labath wrote:

> What if the wasm engine actually made some effort to present a more "conventional" view of the wasm "process"? I.e., in addition to the "implied" PC, we could have an "implied" SP (and maybe an FP too). Then it could lay out the call stack and the function arguments in "memory", and point the "SP" to it so that lldb is able to reconstruct the frames & variables using the "normal" algorithm. For the sake of exposition, let's assume the following "encoding" of the 64-bit memory space:
>
>   unsigned:16 type; // 0x5555 = stack
>   unsigned:16 tid; // are there threads in wasm?
>   unsigned:16 frame; // grows down, 0x0000 = bottommost frame (main), 0x0001 = second from bottom, etc.
>   unsigned:16 address;
>
>
> Then the engine could say that the "SP" of thread 0x1234 is `0x5555123400050000`, "FP" is `0x5555123400048010`, and then have the memory contents be
>
>   0x5555123400048020: // third "local" (of frame 4)
>   0x5555123400048018: // second "local" (of frame 4)
>   0x5555123400048010: // first "local" (of frame 4)
>   0x5555123400048008: 0x5555123400038010 // previous FP (frame 3)
>   0x5555123400048000: ???? // previous PC (frame 3)
>   0x5555123400040010: // third "argument" (of frame 4)
>   0x5555123400040008: // second "argument" (of frame 4)
>   0x5555123400040000: // first "argument" (of frame 4)
>   0x5555123400038020: // third "local" (of frame 3)
>   0x5555123400038018: // second "local" (of frame 3)
>   0x5555123400038010: // first "local" (of frame 3)
>   0x5555123400038008: 0x5555123400028010 // previous FP (frame 2)
>   0x5555123400038000: ???? // previous PC (frame 2)
>   etc.
>
>
> Then all that would be needed is to translate `DW_OP_WASM_location` into an appropriate `FP+offset` combo. Somehow...
>
> I realize that this is basically throwing the problem "over the fence", and asking the other side to deal with things, but I am starting to get sceptical that we will be able to come up with a satisfactory solution within lldb.

When you say

> translate DW_OP_WASM_location into an appropriate FP+offset combo.

do you mean that LLVM should generate these `FP+offset` combos rather than `DW_OP_WASM_location`, or that LLDB should somehow do this translation?
I think the engine can do more to help here, but not a lot more, I am afraid. Yes, it could expose an implied “SP” and “FP”, and that should be sufficient to represent locals and arguments and make stack walking more orthodox. But `DW_OP_WASM_location` also describes locations in the set of Wasm globals and in the Wasm operand stack, so we would need at least a second, parallel stack to represent the operand stack.
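
To make the encoding quoted above concrete, here is a minimal sketch of how such a 64-bit address could be packed and unpacked. The names (`WasmStackAddress`, `Pack`, `Unpack`, `FrameIndex`) are hypothetical and exist neither in LLDB nor in any engine; only the field layout comes from the proposal.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout, copied from the encoding proposed above:
// the four 16-bit fields run from the most significant bits down.
struct WasmStackAddress {
  uint16_t type;    // 0x5555 = stack
  uint16_t tid;     // thread id
  uint16_t frame;   // 0x0000 = bottommost frame (main)
  uint16_t address; // offset within the frame
};

constexpr uint16_t kStackType = 0x5555;

// Combines the four fields into one 64-bit "address".
constexpr uint64_t Pack(WasmStackAddress a) {
  return (uint64_t{a.type} << 48) | (uint64_t{a.tid} << 32) |
         (uint64_t{a.frame} << 16) | uint64_t{a.address};
}

// Splits a 64-bit "address" back into its four fields.
constexpr WasmStackAddress Unpack(uint64_t raw) {
  return {static_cast<uint16_t>(raw >> 48), static_cast<uint16_t>(raw >> 32),
          static_cast<uint16_t>(raw >> 16), static_cast<uint16_t>(raw)};
}

// Returns the frame index; the address must carry the stack tag.
inline uint16_t FrameIndex(uint64_t raw) {
  assert(Unpack(raw).type == kStackType && "not a stack address");
  return Unpack(raw).frame;
}
```

With this layout, the "FP" from the example, `0x5555123400048010`, is type 0x5555, thread 0x1234, frame 4, offset 0x8010.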

Also, for C++, LLVM emits code to maintain a “shadow stack” in the linear memory of the module, and location expressions like `DW_OP_fbreg +N` are already used to describe the location of a parameter or a local variable in that shadow stack. The stack frame pointer for that function is described with `DW_AT_frame_base`, expressed as a `DW_OP_WASM_location` expression.
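
As a rough illustration of what resolving `DW_OP_fbreg +N` means in this setting: once the frame base is known, the variable lives at `frame_base + N` in the module's linear memory, which is little-endian. The helper below is a hypothetical sketch, not LLDB code; it models linear memory as a plain byte vector and reads a 32-bit value with bounds checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reads a little-endian i32 at frame_base + offset
// from a byte vector standing in for the module's linear memory.
// Returns false if the access would fall outside that memory.
inline bool ReadI32AtFbreg(const std::vector<uint8_t> &linear_memory,
                           uint64_t frame_base, int64_t offset,
                           int32_t &out) {
  const uint64_t size = linear_memory.size();
  if (frame_base > size)
    return false;

  uint64_t addr;
  if (offset >= 0) {
    if (static_cast<uint64_t>(offset) > size - frame_base)
      return false;
    addr = frame_base + static_cast<uint64_t>(offset);
  } else {
    // Unsigned negation avoids overflow for the most negative offset.
    const uint64_t magnitude = 0 - static_cast<uint64_t>(offset);
    if (magnitude > frame_base)
      return false;
    addr = frame_base - magnitude;
  }
  if (size - addr < 4)
    return false;

  uint32_t value = 0;
  for (int i = 0; i < 4; ++i)
    value |= static_cast<uint32_t>(linear_memory[addr + i]) << (8 * i);
  out = static_cast<int32_t>(value);
  return true;
}
```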

In the end, walking the stack is not a big problem; its logic can already be encapsulated in an Unwind-derived plugin class. The issues are:

- in `DWARFExpression::Evaluate`, where we need to handle `DW_OP_WASM_location` somehow, and
- in `Value::GetValueAsData`, where we need to read from the memory of the current Wasm module, which is an address space separate from that of the code.
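
For the first point, the decoding step itself is small. The sketch below is a standalone illustration and not the code in this patch. It assumes the operator is encoded as opcode `0xED` followed by a one-byte kind (0 = local, 1 = global, 2 = operand stack) and a ULEB128 index, which is my understanding of the current WebAssembly DWARF extension and should be checked against the spec. All type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed encoding; verify against the WebAssembly DWARF extension.
constexpr uint8_t kDW_OP_WASM_location = 0xED;

enum class WasmLocationKind : uint8_t {
  Local = 0,
  Global = 1,
  OperandStack = 2
};

struct WasmLocation {
  WasmLocationKind kind;
  uint64_t index;
};

// Decodes an unsigned LEB128 value at `pos` and advances `pos` past it.
inline uint64_t ReadULEB128(const std::vector<uint8_t> &bytes, size_t &pos) {
  uint64_t result = 0;
  unsigned shift = 0;
  while (true) {
    assert(pos < bytes.size() && "truncated ULEB128");
    assert(shift < 64 && "ULEB128 value too large");
    const uint8_t byte = bytes[pos++];
    result |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0)
      return result;
    shift += 7;
  }
}

// Decodes one DW_OP_WASM_location operation starting at `pos`.
// On success, fills `out`, advances `pos`, and returns true.
// On failure, leaves `pos` unchanged and returns false.
inline bool DecodeWasmLocation(const std::vector<uint8_t> &expr, size_t &pos,
                               WasmLocation &out) {
  if (expr.size() < 3 || pos > expr.size() - 3)
    return false; // need opcode, kind, and at least one index byte
  if (expr[pos] != kDW_OP_WASM_location)
    return false;
  const uint8_t kind = expr[pos + 1];
  if (kind > 2)
    return false;
  size_t cursor = pos + 2;
  out.kind = static_cast<WasmLocationKind>(kind);
  out.index = ReadULEB128(expr, cursor);
  pos = cursor;
  return true;
}
```

The harder part is what the evaluator does with the decoded `(kind, index)` pair, since the value has to come from the engine rather than from a register or from ordinary memory.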

I understand that it is not easy to plug in this functionality in a very neat way, and maybe I am missing something else here, but if there are no other places involved maybe we can come up with a clean solution.

> And I believe the current problems are just the tip of the iceberg. I can't imagine what hoops we'll need to jump through once we start evaluating expressions...

Expression evaluation works, in my prototype, for simple expressions. For complex expressions, I see logged errors like this in `IRInterpreter::CanInterpret()`:

  Unsupported instruction: %call = call float @_ZNK4Vec3IfE3dotERKS0_(%class.Vec3* %7, %class.Vec3* dereferenceable(12) %8)

It’s not clear to me if the problem is caused by the debug symbols or by the IR generated for Wasm… is there any doc where I could learn more about expression evaluation in LLDB? It’s a topic that really interests me, even outside the scope of this Wasm work.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78801/new/

https://reviews.llvm.org/D78801