[lldb-dev] LLDB's disassemble function

Fri Jul 31 13:37:58 PDT 2020

> On Jul 30, 2020, at 8:07 PM, Rui Hong via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> 
> Hi LLDB devs,
> 
> I have almost finished porting LLDB to my architecture. LLDB now communicates well with the GDB stub of my simulator and can perform debugging actions such as setting breakpoints, continuing, stepping, and reading memory and registers. Thanks to all of you for your kind advice.
> 
> Now I am considering adding disassembly support. Our compiler is LLVM (ported to our new architecture, of course), and an LLVM disassembler has already been implemented. It works well for objdump on the ELF files that LLVM generates, producing a .map file containing the disassembly. LLDB leverages the disassembler from LLVM, so our LLDB can easily use that disassembler plug-in. Here is the problem:
> I found that when LLDB handles the "disassemble/dis/di" command, it detects the current frame, determines the address range to disassemble, sends an "m" packet to read the corresponding memory to get the program, and uses the LLVM disassembler plug-in to disassemble the code. I would like to call this "dynamic disassembling". But in our architecture, program and data live in separate memories that have the same address space. When reading memory, we read only the data memory, not the program memory. In other words, program code cannot be accessed from the processor, only from the ELF file (the code is not altered at run time; we use hardware breakpoints).
> 
> So:
> Can LLDB do "static disassembling"? (which just uses program/code from the ELF executable file without reading memory from the processor during run time)

Yes it can. There are two ways to read memory:

size_t Target::ReadMemory(const Address &addr, bool prefer_file_cache,
                          void *dst, size_t dst_len, Status &error,
                          lldb::addr_t *load_addr_ptr);

size_t Process::ReadMemory(addr_t addr, void *buf, size_t size, Status &error);

Process::ReadMemory always reads from the program you are debugging and sends a packet to the GDB remote stub when ProcessGDBRemote is in use.

Target::ReadMemory has the option to read from sections in the object file if "prefer_file_cache" is set to true.
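To make the difference between the two read paths concrete, here is a minimal, self-contained sketch. It does not use the real LLDB classes: MockProcess, MockTarget, and their members are hypothetical stand-ins, and the fallback order is a simplified assumption rather than a copy of what Target::ReadMemory actually does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for lldb_private::Process: every read goes to the
// live target (for ProcessGDBRemote this would be an "m" packet).
struct MockProcess {
  std::vector<uint8_t> data_memory; // live memory, based at address 0

  size_t ReadMemory(uint64_t addr, void *buf, size_t size) const {
    assert(buf != nullptr);
    if (addr >= data_memory.size())
      return 0;
    const size_t offset = static_cast<size_t>(addr);
    const size_t n = std::min(size, data_memory.size() - offset);
    std::memcpy(buf, data_memory.data() + offset, n);
    return n;
  }
};

// Hypothetical stand-in for lldb_private::Target: it can also serve bytes
// from a section of the object file on disk.
struct MockTarget {
  uint64_t section_load_addr = 0;     // where the section is loaded
  std::vector<uint8_t> section_bytes; // section contents from the ELF file
  const MockProcess *process = nullptr;

  size_t ReadFromFile(uint64_t addr, void *dst, size_t dst_len) const {
    assert(dst != nullptr);
    if (addr < section_load_addr)
      return 0;
    const uint64_t delta = addr - section_load_addr;
    if (delta >= section_bytes.size())
      return 0;
    const size_t offset = static_cast<size_t>(delta);
    const size_t n = std::min(dst_len, section_bytes.size() - offset);
    std::memcpy(dst, section_bytes.data() + offset, n);
    return n;
  }

  // Simplified assumption: try the file first when asked to, otherwise try
  // the live process first and fall back to the file only if that fails.
  size_t ReadMemory(uint64_t addr, bool prefer_file_cache, void *dst,
                    size_t dst_len) const {
    if (prefer_file_cache) {
      if (size_t n = ReadFromFile(addr, dst, dst_len))
        return n;
    }
    if (process) {
      if (size_t n = process->ReadMemory(addr, dst, dst_len))
        return n;
    }
    return prefer_file_cache ? 0 : ReadFromFile(addr, dst, dst_len);
  }
};
```

On a target like the one described in the question, where code and data share addresses, the same address yields ELF bytes with prefer_file_cache set to true and data-memory bytes with it set to false.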

In the "disassemble" command the following function does the work in lldb/source/Commands/CommandObjectDisassemble.cpp:

bool CommandObjectDisassemble::DoExecute(Args &command, CommandReturnObject &result);

It will call this function in lldb/source/Core/Disassembler.cpp:

bool Disassembler::Disassemble(Debugger &debugger, const ArchSpec &arch,
                               const char *plugin_name, const char *flavor,
                               const ExecutionContext &exe_ctx,
                               const Address &address, Limit limit,
                               bool mixed_source_and_assembly,
                               uint32_t num_mixed_context_lines,
                               uint32_t options, Stream &strm);

That in turn calls this function:

size_t Disassembler::ParseInstructions(Target &target, Address start,
                                       Limit limit, Stream *error_strm_ptr,
                                       bool prefer_file_cache);

Back in Disassembler::Disassemble(...) the call site looks like:

  const bool prefer_file_cache = false;
  size_t bytes_disassembled = disasm_sp->ParseInstructions(
      exe_ctx.GetTargetRef(), address, limit, &strm, prefer_file_cache);

So we are hard-coding it to always read from memory.

One way to fix this is to ask the lldb_private::Process subclass (ProcessGDBRemote in this case) whether disassembly should prefer the file cache. For almost all architectures we want this to be false. For yours we want it to be true. The question now is how to detect this. One idea is to add key/value pairs to the response to the qHostInfo or qProcessInfo queries that LLDB sends to the remote stub. For example, when debugging a simple a.out program, the Darwin GDB server "debugserver" responds with:

 <  13> send packet: $qHostInfo#9b
 < 166> read packet: $cputype:16777223;cpusubtype:8;ostype:macosx;watchpoint_exceptions_received:after;vendor:apple;os_version:10.15.5;maccatalyst_version:13.5;endian:little;ptrsize:8;#00

 <  16> send packet: $qProcessInfo#dc
 < 193> read packet: $pid:13cad;parent-pid:13cb0;real-uid:24069482;real-gid:6fd32dba;effective-uid:24069482;effective-gid:6fd32dba;cputype:1000007;cpusubtype:8;ptrsize:8;ostype:macosx;vendor:apple;endian:little;#00

So it wouldn't be hard to add an extra key/value pair to either of these and then store them in the ProcessGDBRemote class. We would need to add a new virtual function to lldb_private::Process:
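As a rough illustration of the key/value idea, the sketch below parses the payload of such a reply (the text between '$' and '#') and looks for an extra key. The key name "prefer-file-cache-for-disassembly" and both function names are made up for this example; no such key exists in the protocol today, and the real parsing code in ProcessGDBRemote is organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Split a "key:value;key:value;" payload into a map. Fields without a
// colon are ignored.
std::map<std::string, std::string>
ParseKeyValuePairs(const std::string &payload) {
  std::map<std::string, std::string> pairs;
  size_t pos = 0;
  while (pos < payload.size()) {
    size_t end = payload.find(';', pos);
    if (end == std::string::npos)
      end = payload.size();
    const std::string field = payload.substr(pos, end - pos);
    const size_t colon = field.find(':');
    if (colon != std::string::npos)
      pairs[field.substr(0, colon)] = field.substr(colon + 1);
    pos = end + 1;
  }
  return pairs;
}

// Hypothetical key: a stub would opt in by sending
// "prefer-file-cache-for-disassembly:1;". A missing key means false, so
// existing stubs keep the current behavior.
bool PreferFileCacheFromInfo(const std::string &payload) {
  const auto pairs = ParseKeyValuePairs(payload);
  const auto it = pairs.find("prefer-file-cache-for-disassembly");
  return it != pairs.end() && it->second == "1";
}
```

Defaulting to false when the key is missing means stubs that never send it are unaffected.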

class Process {
public:
  // ... existing members ...
  virtual bool PreferFileCacheForDisassembly() const {
    return false;
  }
};

Then we would override this in ProcessGDBRemote and it would return the right value.
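A minimal standalone sketch of how the override and the call site could fit together follows. The two classes here are simplified mock-ups that only borrow the LLDB class names. The setter, the m_prefer_file_cache member, and the ChoosePreferFileCache helper are assumptions made for illustration, not existing LLDB API.

```cpp
#include <cassert>

// Mock-up of lldb_private::Process with only the proposed virtual function.
class Process {
public:
  virtual ~Process() = default;
  // Default for almost all architectures: read code from live memory.
  virtual bool PreferFileCacheForDisassembly() const { return false; }
};

// Mock-up of ProcessGDBRemote. The flag would be set after parsing the
// qHostInfo or qProcessInfo reply from the stub (setter is hypothetical).
class ProcessGDBRemote : public Process {
public:
  void SetPreferFileCacheForDisassembly(bool value) {
    m_prefer_file_cache = value;
  }
  bool PreferFileCacheForDisassembly() const override {
    return m_prefer_file_cache;
  }

private:
  bool m_prefer_file_cache = false;
};

// Hypothetical helper standing in for the hard-coded
// "const bool prefer_file_cache = false;" in Disassembler::Disassemble.
// With no process (static inspection of a file) it keeps today's value.
bool ChoosePreferFileCache(const Process *process) {
  return process != nullptr && process->PreferFileCacheForDisassembly();
}
```

The value returned by the helper would then be passed to ParseInstructions in place of the constant.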

I am CC'ing in a few other GDB server experts directly for more feedback.

Greg

> 
> Kind regards,
> Rui
> 
