[Lldb-commits] [PATCH] D72751: [LLDB] Add DynamicLoaderWasmDYLD plugin for WebAssembly debugging

Thu Jan 23 01:33:56 PST 2020

labath added a subscriber: aadsm.
labath added a comment.

[ looping in @aadsm for the svr4 stuff ]

Thanks for adding the test, and for the detailed writeup. Please find my comments inline.

In D72751#1835502 <https://reviews.llvm.org/D72751#1835502>, @paolosev wrote:

> The first is that we need to call `Target::SetSectionLoadAddress()` twice, from two different places. First we need to call `Target::SetSectionLoadAddress()` in `ObjectFileWasm::SetLoadAddress()`, and then again in `DynamicLoaderWasmDYLD::DidAttach()`. The reason for this seems to originate in the sequence of function calls:
>
> In `DynamicLoaderWasmDYLD::DidAttach()` we call `ProcessGDBRemote::LoadModules()` to get list of loaded modules from the remote (Wasm engine).
>  `ProcessGDBRemote::LoadModules()` calls, first:
>
> - `DynamicLoaderWasmDYLD::LoadModuleAtAddress()` and from there: ...
>
>   then:
> - `Target::SetExecutableModule() -> Target::ClearModules() -> SectionLoadList::Clear()`
>
>   So, at the end of `LoadModules()` in `DynamicLoaderWasmDYLD::DidAttach()` the SectionLoadList is empty, and we need to set it again by calling `Target::.SetSectionLoadAddress()` again. This works but the duplication is ugly; is there a way to improve this?

I hope so. :) This seems like a bug in `ProcessGDBRemote::LoadModules`. It seems wrong/wasteful/etc to do all this work to compute the section load addresses only to have them be thrown away by `SetExecutableModule`. Maybe all it would take is to reverse the order of these two actions, so that the load addresses persist? Can you try something like that?

On a side note, `ProcessGDBRemote::LoadModules` seems a bit internally inconsistent. At one place it claims that "The main executable will never be included in libraries-svr4", but then it goes on to set an executable module anyway. This could in fact be a clue as to why this problem hasn't showed up on other platforms -- if the remote does not send the executable, then SetExecutableModule is not called.

On a side-side note, this may mean that you sending the main wasm file through `qXfer:libraries-svr4` may not be correct. However, fixing that would mean finding another way to communicate the main executable name/address. I don't think we currently have an easy way to do that so it may be better to fix `ProcessGDBRemote::LoadModules`, given that it "almost" supports executables.

I am also worried about the fact that `SymbolFileDWARF::CalculateAbilities` requires the module to be "loaded". That shouldn't be normally required. That function does a very basic check on some sections, and this should work fine without those sections having a "load address", even if they are actually being loaded from target memory. I think this means there are some additional places where ObjectFileWasm should use `m_memory_addr` instead of something else...

> The second problem is that the Code Section needs to be initialized (in `ObjectFileWasm::CreateSections()`) with `m_file_addr = m_file_offset = 0`, and not with the actual file offset of the Code section in the Wasm file. If we set `Section::m_file_addr` and `Section::m_file_offset` to the actual code offset, the DWARF info does not work correctly.
> 
> I have some doubts regarding the DWARF data generated by Clang for a Wasm target. Looking at an example, for a Wasm module that has the Code section at offset 0x57, I see this DWARF data:
> 
>   0x0000000b: DW_TAG_compile_unit
>                 […]
>                 DW_AT_low_pc (0x0000000000000000)
>                 DW_AT_ranges (0x00000000
>                    [0x00000002, 0x0000000e)
>                    [0x0000000f, 0x0000001a)
>                    [0x0000001b, 0x00000099)
>                    [0x0000009b, 0x0000011c))
> 
> 
> The documentation <https://yurydelendik.github.io/webassembly-dwarf/#pc> says that //“Wherever a code address is used in DWARF for WebAssembly, it must be the offset of an instruction relative within the Code section of the WebAssembly file.”// 
>  But is this correct? Shouldn't maybe code addresses be offset-ed by the file address of the Code section?

That's interesting. I don't think that clang is really wrong/non-conforming here, but this choice plays a rather poorly with the way lldb handles object files (and how your remote presents them). In this case special casing the code section may be fine, but you should be aware that there are places in lldb which expect that the "load bias" (the delta between file and load addresses) is the same for all sections in a module (because that's how it works elsewhere), and they may not like this. However, if you have only one code section, I think you should be mostly fine.

I do think that this can be handled in a better way (I'm pretty sure you don't need to zero out the file offset for instance), but we can wait with that until we resolve the first point...

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72751/new/

https://reviews.llvm.org/D72751