[PATCH] D71575: [LLDB] Add ObjectFileWasm plugin for WebAssembly debugging

Tue Jan 7 04:19:04 PST 2020

labath added a comment.

In D71575#1803321 <https://reviews.llvm.org/D71575#1803321>, @paolosev wrote:

> In D71575#1793791 <https://reviews.llvm.org/D71575#1793791>, @labath wrote:
>
> > A bunch more comments from me. :)
> >
> > A higher level question I have is whether there's something suitable within llvm for parsing wasm object files that could be reused. I know this can be tricky with files loaded from memory etc., so it's fine if it isn't possible to do that, but I am wondering if you have considered this option.
>
>
> I have considered this option, there is indeed some code duplication because there is already a Wasm parser in class WasmObjectFile (llvm/include/llvm/Object/Wasm.h).
>  However class WasmObjectFile is initialized with a memory buffer that contains the whole module, and this is not ideal. LLDB might create an ObjectFileWasm either from a file or by retrieving specific module chunks from a remote, but it doesn't often need to load the whole module in memory.
>  I think this is the reason why other kind of object files (like ObjectfileELF) are re-implemented in LLDB rather than reusing existing code in LLVM (like ELFFile in llvm/include/llvm/Object/ELF.h).

Thanks for sharing your thoughts. I think the history of ObjectFileELF is a bit more complicated, but yes, being able to load from memory is a good reason for not using the llvm reader right now (also, the parsing code seems to be quite simple).

================
Comment at: lldb/source/Plugins/ObjectFile/wasm/ObjectFileWasm.cpp:252
+
+  ModuleSpec spec(file, ArchSpec("wasm32-unknown-unknown-wasm"));
+  specs.Append(spec);
----------------
paolosev wrote:
> labath wrote:
> > I take it that wasm files don't have anything like a build id, uuid or something similar?
> Not yet. There is a proposal to add a uuid to wasm modules in a custom section, but it's not part of the standard yet.
Ok. Thanks for clarifying. I think somethink like that will be useful, as otherwise you can't tell if you're using the correct separate debug info file.

================
Comment at: lldb/source/Plugins/ObjectFile/wasm/ObjectFileWasm.cpp:342-343
+                        section_type,                  // Section type.
+                        sect_info.offset & 0xffffffff, // VM address.
+                        sect_info.size, // VM size in bytes of this section.
+                        sect_info.offset &
----------------
paolosev wrote:
> labath wrote:
> > Are the debug info sections actually loaded into memory for wasm? Should these be zero (that's what they are for elf)?
> Yes, I was thinking that the debug sections should be loaded into memory; not sure how this works for ELF, how does the debugger find the debug info in that case?
Are you referring to the load-from-memory, or load-from-file scenario? Normally, the debug info is not loaded into memory, and if lldb is loading the module from a file, then the debug info is loaded from the file (and we use the "file offset" field to locate them). Loading files from memory does not work in general (because we can't find the whole file there -- e.g., section headers are missing). The only case it does work is in case of jitted files. I'm not 100% sure how that works, but given that there is no `if(memory)` block in the section creation code, those sections must too get `vm size = 0`.

In practice, I don't think it does not matter that much what you put here -- I expect things will mostly work regardless. I am just trying to make this consistent in some way. If these sections are not found in the memory at the address which you set here (possibly adjusted by the load bias) then I think it makes sense to set `vm_size=vm_addr=0`. If they are indeed present there, then setting it to those values is perfectly fine.

================
Comment at: lldb/source/Plugins/ObjectFile/wasm/ObjectFileWasm.cpp:382
+
+  DecodeSections(load_address);
+
----------------
paolosev wrote:
> labath wrote:
> > This is strange.. I wouldn't expect that the section decoding logic should depend on the actual address that the object is loaded in memory. Can you explain the reasoning here?
> There is a reason for this: DecodeNextSection() calls:
> ```ReadImageData(offset, size)```
> and when the debug info sections are embedded in the wasm module (not loaded from a separated symbol file), ReadImageData() calls Process::ReadMemory() that, for a GDB remote connection, goes to ProcessGDBRemote::DoReadMemory(). Here I pass the offset because it also represents the specific id of the module loaded in the target debuggee process, as explained in the comment above.
What you say about ReadMemory makes sense, but it's not clear to me why you couldn't use the value in `m_memory_addr` for this. For in-memory object file this value should contain the actual address the object file was loaded from (which, I would expect, will include the `module_id` business), and so you wouldn't need the dynamic loader address in order to locate it. I believe this is how the other object file plugins do their in-memory loading..

================
Comment at: lldb/source/Utility/ArchSpec.cpp:226
+    {eByteOrderLittle, 4, 1, 4, llvm::Triple::wasm32, ArchSpec::eCore_wasm32,
+     "wasm32"}
+  };
----------------
add a trailing comma to avoid subsequent needs to touch this line when adding new entries..

================
Comment at: lldb/test/Shell/ObjectFile/wasm/basic.yaml:21-79
+Sections:
+  - Type:            TYPE
+    Signatures:
+      - Index:           0
+        ParamTypes:
+          - I32
+        ReturnTypes:
----------------
Could you remove the sections which are not relevant for this test? This makes it easier to see the correspondence between the yaml and the test expectations..

================
Comment at: lldb/test/Shell/ObjectFile/wasm/debug-sections.yaml:42
+Sections:
+  - Type:            TYPE
+    Signatures:
----------------
Same here (and you don't even have to put actual valid debug info in the debug info sections -- just a couple of random bytes is sufficient).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71575/new/

https://reviews.llvm.org/D71575