[LLVMdev] MCJIT debugger registration interface.

Sun Aug 10 13:43:58 PDT 2014

I think this ignores the real problem with the MCJIT debugging interface: it doesn't give MCJIT clients any way of directly accessing and parsing the debug metadata. 

WebKit, and likely other non-C/C++ clients of MCJIT, will not want MCJIT to register anything with the system debugger. Non-C languages usually have a different set of debugging interfaces, and it's up to the LLVM client to glue the debugging information that MCJIT knows about to the debugging interface that the client knows about. MCJIT's current architecture makes this extremely awkward.

This is part of a bigger problem in the MCJIT API: it is designed to work like an execution engine for C programs despite the fact that the most compelling use of MCJIT is a higher-tier JIT that is part of a mixed-mode or tiered runtime for non-C languages. Is there some client of the MCJIT that actually benefits from the MCJIT pretending to be an execution engine for C programs?  Is there a reason why this client should get more attention than the seemingly more compelling non-C use cases?

-Filip

> On Aug 1, 2014, at 6:10 PM, Lang Hames <lhames at gmail.com> wrote:
> 
> Hi All,
> 
> I'd like to revisit the MCJIT debugger-registration system, as the existing system has a few flaws, some of which are seriously problematic.
> 
> The 20,000-foot overview of the existing scheme (implemented in llvm/lib/ExecutionEngine/RuntimeDyld/GDBRegistrar.cpp and friends), as I understand it, is as follows:
> 
> We have two symbols in MCJIT that act as fixed points for the debugger to latch on to:
> 
> __jit_debug_register_code is a no-op function that the debugger can set a breakpoint on.  MCJIT will call this function to notify the debugger when an object file is loaded.
> 
> __jit_debug_descriptor is the head of a C linked list data structure that contains pointers to in-memory object files. The ELF/MachO headers of the in-memory object files will have had their vaddrs fixed up by the JIT to point to where each of the linked sections resides in memory.
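
For reference, the two symbols described above match the JIT compilation interface documented in the GDB manual. The following is a minimal C sketch of those declarations, not MCJIT's actual code. The struct, enum, and symbol names come from that interface; register_object is a hypothetical helper added only to show the order of operations, and the noinline attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Actions the JIT can report to the debugger. */
typedef enum {
  JIT_NOACTION = 0,
  JIT_REGISTER_FN,
  JIT_UNREGISTER_FN
} jit_actions_t;

/* One node per in-memory object file. */
struct jit_code_entry {
  struct jit_code_entry *next_entry;
  struct jit_code_entry *prev_entry;
  const char *symfile_addr; /* start of the in-memory object file */
  uint64_t symfile_size;    /* size of that object file in bytes */
};

/* Head of the list; the debugger reads this from the inferior. */
struct jit_descriptor {
  uint32_t version;      /* 1 for this layout */
  uint32_t action_flag;  /* a jit_actions_t value */
  struct jit_code_entry *relevant_entry; /* entry the action refers to */
  struct jit_code_entry *first_entry;    /* head of the linked list */
};

/* The debugger sets a breakpoint on this function. It must not be
   inlined or optimized away, hence the attribute and the empty asm. */
void __attribute__((noinline)) __jit_debug_register_code(void) {
  __asm__ volatile("");
}

struct jit_descriptor __jit_debug_descriptor = {1, JIT_NOACTION, NULL, NULL};

/* Hypothetical helper: link a new entry at the head of the list,
   then notify the debugger by calling the breakpoint function. */
void register_object(struct jit_code_entry *entry, const char *obj,
                     uint64_t size) {
  assert(entry != NULL && obj != NULL);
  entry->symfile_addr = obj;
  entry->symfile_size = size;
  entry->prev_entry = NULL;
  entry->next_entry = __jit_debug_descriptor.first_entry;
  if (entry->next_entry != NULL)
    entry->next_entry->prev_entry = entry;
  __jit_debug_descriptor.first_entry = entry;
  __jit_debug_descriptor.relevant_entry = entry;
  __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
  __jit_debug_register_code();
}
```

Note that an entry carries only a pointer to the object file and its size. The interface has no separate field for section load addresses, which is consistent with the addresses being written into the object's own headers.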
> 
> There are a couple of problems with this system: (1) Modifying object-file headers in-place violates some internal LLVM contracts. In particular, the object files may be backed by read-only memory. This has caused crashes in the JIT that have forced me to revert support for debugger registration on the MachO side (we really want to replace this on the ELF side soon too). (2) The JIT has no way of knowing whether a debugger is attached, which means keeping object files in memory even if they're not being used, just in case there is an attached debugger that needs them.
> 
> We'd really like to come up with a system that doesn't have these drawbacks. That is, a system where the object files remain unmodified, and the JIT knows if/when a debugger attaches so that it can generate the relevant information on the fly.
> 
> It would be great if the debugger experts (and particularly anyone who has experience on both the debugger and the JIT side of things) could weigh in on these issues. In particular:
> 
> (1) Is there a reason we bake the vmaddrs into the object file headers, or could they just as easily be passed in a side-table so as to keep the object untouched?
> 
> (2) Is there a canonical way for the debugger to communicate to a JIT that it's interested in inspecting the JIT's output? If we're going to use breakpoints (or something like that) to signal to the debugger when objects have been linked, is it reasonable to have an API that the debugger can call into to request the information it's looking for? If the JIT actually receives a call then it would give us a chance to lazily populate the necessary data structures.
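
To make the side-table idea in question (1) concrete, here is one possible shape for it in C. This is a sketch under assumptions, not an existing LLVM or debugger interface: the names section_load_addr, jit_object_info, and lookup_load_addr are all hypothetical. The point it illustrates is that the object buffer can stay const (and so may live in read-only memory) while the load addresses travel alongside it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: where one section of the object ended up in memory. */
struct section_load_addr {
  const char *name;   /* section name as it appears in the object file */
  uint64_t load_addr; /* address the JIT placed the section at */
};

/* Hypothetical: what the JIT would publish per object file. */
struct jit_object_info {
  const uint8_t *object; /* unmodified object file, possibly read-only */
  size_t object_size;
  const struct section_load_addr *sections; /* the side table */
  size_t num_sections;
};

/* Returns 1 and stores the load address in *out if the section is in
   the side table; returns 0 and leaves *out untouched otherwise. */
int lookup_load_addr(const struct jit_object_info *info, const char *name,
                     uint64_t *out) {
  assert(info != NULL && name != NULL && out != NULL);
  for (size_t i = 0; i < info->num_sections; ++i) {
    if (strcmp(info->sections[i].name, name) == 0) {
      *out = info->sections[i].load_addr;
      return 1;
    }
  }
  return 0;
}
```

A debugger-side consumer would parse the untouched object for symbols and debug info, then use the table to translate section-relative addresses. Whether existing debuggers could consume such a table is exactly the open question posed above.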
> 
> Regards,
> Lang.