[Lldb-commits] [PATCH] D72751: [LLDB] Add DynamicLoaderWasmDYLD plugin for WebAssembly debugging
Greg Clayton via Phabricator via lldb-commits
lldb-commits at lists.llvm.org
Mon Jan 27 11:25:43 PST 2020
clayborg added a comment.
In D72751#1835656 <https://reviews.llvm.org/D72751#1835656>, @labath wrote:
> [ looping in @aadsm for the svr4 stuff ]
>
> Thanks for adding the test, and for the detailed writeup. Please find my comments inline.
>
> In D72751#1835502 <https://reviews.llvm.org/D72751#1835502>, @paolosev wrote:
>
> > The first is that we need to call `Target::SetSectionLoadAddress()` twice, from two different places. First we need to call `Target::SetSectionLoadAddress()` in `ObjectFileWasm::SetLoadAddress()`, and then again in `DynamicLoaderWasmDYLD::DidAttach()`. The reason for this seems to originate in the sequence of function calls:
> >
> > In `DynamicLoaderWasmDYLD::DidAttach()` we call `ProcessGDBRemote::LoadModules()` to get list of loaded modules from the remote (Wasm engine).
> > `ProcessGDBRemote::LoadModules()` calls, first:
> >
> > - `DynamicLoaderWasmDYLD::LoadModuleAtAddress()` and from there: ...
> >
> > then:
> > - `Target::SetExecutableModule() -> Target::ClearModules() -> SectionLoadList::Clear()`
> >
> > So, at the end of `LoadModules()` in `DynamicLoaderWasmDYLD::DidAttach()` the SectionLoadList is empty, and we need to set it again by calling `Target::.SetSectionLoadAddress()` again. This works but the duplication is ugly; is there a way to improve this?
>
>
> I hope so. :) This seems like a bug in `ProcessGDBRemote::LoadModules`. It seems wrong/wasteful/etc to do all this work to compute the section load addresses only to have them be thrown away by `SetExecutableModule`. Maybe all it would take is to reverse the order of these two actions, so that the load addresses persist? Can you try something like that?
I would be interested to see if this helps as well.
> On a side note, `ProcessGDBRemote::LoadModules` seems a bit internally inconsistent. At one place it claims that "The main executable will never be included in libraries-svr4", but then it goes on to set an executable module anyway. This could in fact be a clue as to why this problem hasn't showed up on other platforms -- if the remote does not send the executable, then SetExecutableModule is not called.
The logic in ProcessGDBRemote::LoadModules is bad, the executable should be set before any sections are loaded.
> On a side-side note, this may mean that you sending the main wasm file through `qXfer:libraries-svr4` may not be correct. However, fixing that would mean finding another way to communicate the main executable name/address. I don't think we currently have an easy way to do that so it may be better to fix `ProcessGDBRemote::LoadModules`, given that it "almost" supports executables.
Is there a "qXfer:executable"? Seems from the code in ProcessGDBRemote::LoadModules() that people were seeing the main executable in the list somewhere. Be a shame to not allow it.
> I am also worried about the fact that `SymbolFileDWARF::CalculateAbilities` requires the module to be "loaded". That shouldn't be normally required. That function does a very basic check on some sections, and this should work fine without those sections having a "load address", even if they are actually being loaded from target memory. I think this means there are some additional places where ObjectFileWasm should use `m_memory_addr` instead of something else...
So a lot of these problems might go away if the modify the ObjectFileWASM to "do the right thing" when the ObjectFile is from a live process. Object file instances know if there are from a live process because the ObjectFile members:
lldb::ProcessWP m_process_wp;
const lldb::addr_t m_memory_addr;
Will be filled in. So the object file doesn't need to have its sections loaded in a target in order to read section contents right? The object file can be asked to read its section data via ObjectFile::ReadSectionData(...) (2 variants).
So Pavel is correct, the sections don't need to be loaded for anything in SymbolFileDWARF::CalculateAbilities(...). The section list does need to be created and available during SymbolFileDWARF::CalculateAbilities(...). We just need to be able to get the section contents and poke around at the DWARF bits.
>
>
>> The second problem is that the Code Section needs to be initialized (in `ObjectFileWasm::CreateSections()`) with `m_file_addr = m_file_offset = 0`, and not with the actual file offset of the Code section in the Wasm file. If we set `Section::m_file_addr` and `Section::m_file_offset` to the actual code offset, the DWARF info does not work correctly.
>>
>> I have some doubts regarding the DWARF data generated by Clang for a Wasm target. Looking at an example, for a Wasm module that has the Code section at offset 0x57, I see this DWARF data:
>>
>> 0x0000000b: DW_TAG_compile_unit
>> […]
>> DW_AT_low_pc (0x0000000000000000)
>> DW_AT_ranges (0x00000000
>> [0x00000002, 0x0000000e)
>> [0x0000000f, 0x0000001a)
>> [0x0000001b, 0x00000099)
>> [0x0000009b, 0x0000011c))
>>
>>
>> The documentation <https://yurydelendik.github.io/webassembly-dwarf/#pc> says that //“Wherever a code address is used in DWARF for WebAssembly, it must be the offset of an instruction relative within the Code section of the WebAssembly file.”//
>> But is this correct? Shouldn't maybe code addresses be offset-ed by the file address of the Code section?
>
> That's interesting. I don't think that clang is really wrong/non-conforming here, but this choice plays a rather poorly with the way lldb handles object files (and how your remote presents them). In this case special casing the code section may be fine, but you should be aware that there are places in lldb which expect that the "load bias" (the delta between file and load addresses) is the same for all sections in a module (because that's how it works elsewhere), and they may not like this. However, if you have only one code section, I think you should be mostly fine.
>
> I do think that this can be handled in a better way (I'm pretty sure you don't need to zero out the file offset for instance), but we can wait with that until we resolve the first point...
So the DWARF plug-ins definitely expect the addresses in the DWARF to be file addresses where if you ask the module for its section list, it can look up this "file address" in the list, it will find a single section. If this is not the case, you will need to make a special SymbolFileDWARFWasm subclass and this will be very tricky as any address you get from the DWARF will need to be converted so that the lookup the section list will work.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D72751/new/
https://reviews.llvm.org/D72751
More information about the lldb-commits
mailing list