[Lldb-commits] [PATCH] D133534: Complete support of loading a darwin kernel over a live gdb-remote connection given the address of a mach-o fileset

Jason Molenda via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Thu Sep 8 21:59:41 PDT 2022


jasonmolenda marked an inline comment as done.
jasonmolenda added a comment.

Thanks Jim!  Really helpful feedback and questions.



================
Comment at: lldb/source/Core/DynamicLoader.cpp:191
   ModuleSP module_sp;
   PlatformSP platform_sp = process->GetTarget().GetPlatform();
   Target &target = process->GetTarget();
----------------
jingham wrote:
> This isn't your addition, but it seems dangerous to assume process will always be non-null here...
Thanks, I don't even know why this is here, I don't seem to be calling the platform anywhere.


================
Comment at: lldb/source/Core/DynamicLoader.cpp:233
   if (module_sp.get()) {
+    // Ensure the Target has an architecture set in case
+    // we need it while processing this binary/eh_frame/debug info.
----------------
jingham wrote:
> It seems weird that we would get all the way here, with a live process and everything, but not have a target architecture.  How does that happen?
I hit this with a macho corefile debugging case where the Target arch *could* have been set by the corefile's cputype / cpusubtype, but it hadn't been yet (and that may not be the most accurate arch), and this crashed when it created a DataExtractor and tried to get addrsize.  It's pretty early in the target's lifetime, so very little has been set up; it seemed simplest to make sure there's an architecture for the Target before I run code that I know may need it.


================
Comment at: lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp:724
+              ObjectFile::eStrataKernel) {
         if (m_uuid != exe_module_sp->GetUUID()) {
           // The user specified a kernel binary that has a different UUID than
----------------
jingham wrote:
> This second test is redundant, isn't it?
> 
This code is really not filling me with joy, I'm sure I wrote it a decade+ ago myself.  The idea was that once I know the actual running UUID of the kernel in this process, I want to remove any non-matching modules that may have been added manually by a user specifying the wrong kernel binary.  Otherwise we have extra kernel symbols loaded and symbol name lookups can get the wrong one.  Let me rewrite this to do a clearer pass over all the Target's modules and remove any kernels with non-matching UUIDs, instead of just doing this with the 0th module in the list (the "executable module").


================
Comment at: lldb/source/Plugins/Platform/MacOSX/PlatformDarwinKernel.cpp:747
+
+  // Give the generic methods, including possibly calling into DebugSymbols
+  // framework on macOS systems, a chance.
----------------
jingham wrote:
> Is this comment accurate anymore?  You only get to this code if module_spec.GetUUID is NOT valid, in which case DebugSymbols isn't going to find anything.
The old code behaved this way -- if the FileSpec in the ModuleSpec was empty or we had no UUID, we would call into the more generic `PlatformDarwin::GetSharedModule`.  If I had a FileSpec, if it was the special name "mach_kernel", I would look in the list of local mach kernels I'd found by scanning; else it was a kext BundleID and I would look up the kexts I scanned locally by that bundle ID.

In the new code path, I might not be able to set "mach_kernel" filename for the kernel because I might only have gotten it by a bare address.  So now I will check the UUIDs of all the kernels on the local system for a match.  Else I'll try the kexts by bundle ID.  If the "mach_kernel"/empty name finds a module, we return it directly.  Kexts return directly.

Then we fall through to calling `Platform::GetSharedModule` with whatever we had in the ModuleSpec -- a FileSpec maybe, a UUID maybe.  I believe this will be correct behavior.


================
Comment at: lldb/source/Plugins/Platform/MacOSX/PlatformDarwinKernel.cpp:928
+    return LLDB_INVALID_ADDRESS;
+  ModuleSP module_sp(new Module(ModuleSpec()));
+  ObjectContainerSP container_sp(
----------------
jingham wrote:
> Do you really need to pass in a module_sp with a pretty much totally bogus module in it?  That seems weird to me.  The Module constructor does a bunch of work, all of which is going to fail with an empty ModuleSpec, so at some point CreateMemoryInstance is going to replace whatever is in this module_sp with something real.  So what's the point of passing it something it's going to throw away?
Yeah, this is how the CreateMemoryInstance for ObjectFiles work too - `Process::ReadModuleFromMemory` creates a Module, possibly adding an ObjectFile from memory, and returns the ModuleSP.  In my case here, I'm not going to use/return the ObjectContainer being constructed, but Jonas' `ObjectContainerMachOFileset::CreateMemoryInstance` requires a Module that it locks before starting its operations, and adds an ObjectFile to it.  I think this is an artifact of following the ObjectFile API pattern, I checked with Jonas and he didn't want to deviate.


================
Comment at: lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp:2214
+            x.consume_front("0x");
+            if (llvm::to_integer(x, vmaddr, 16))
+              m_binary_addresses.push_back(vmaddr);
----------------
jingham wrote:
> llvm::to_integer(..., 16) doesn't handle strings that start with 0x?  That seems weak.
Looking at `llvm::consumeUnsignedInteger` in lib/Support/StringRef.cpp it will call `GetAutoSenseRadix()` if a radix isn't specified, it will see & advance past "0x" and work.  But if you also want to accept base16 numbers that aren't 0x prefixed, you have to specify the radix to `llvm::to_integer()` and the `GetAutoSenseRadix()` won't be called.  It only does one or the other, it doesn't accept a mix.


================
Comment at: lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp:597
           const bool notify = true;
+          if (GetTarget()
+                  .GetDebugger()
----------------
jingham wrote:
> I'm a little surprised we don't have a platform at this point?
> 
> What happens if we had a Platform at this point, but it wasn't the one that knew how to load this binary?  That seems awkward.
We do have a platform, but it's a Host platform.  We've just connected to a gdb remote serial protocol port -- it could be debugserver, it could be a jtag device, it could be a kernel debugger.  Once we get a real binary, we can start figuring out the best platform and dynamic linker, but we're starting with almost nothing here, so given this special binary address, we need to ask any of the platform plugins if they know what to do with it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133534/new/

https://reviews.llvm.org/D133534



More information about the lldb-commits mailing list