[Lldb-commits] [lldb] r221213 - Fixed SBTarget::ReadMemory() to work correctly and the TestTargetAPI.py test case that was reading target memory in TargetAPITestCase.test_read_memory_with_dsym and TargetAPITestCase.test_read_memory_with_dwarf.

Jason Molenda jmolenda at apple.com
Tue Nov 4 14:37:07 PST 2014


FWIW the use of an Address object to represent addresses was motivated by the hassle of address handling we had with gdb.  If you start the debugger and give it an executable and a bunch of solibs, set an address breakpoint, and then run the process, how does that address breakpoint get re-set to the correct place when the executable and all the solibs land at their final address?  What if multiple solibs are at the same addr (say 0x0) before we start execution?  (on Mac OS X solibs don't have distinct virtual addresses these days - they're all 0-based and the dynamic loader picks a spot in the address space at runtime)

The Address object represents addresses as a section + offset-into-that-section, if it's within the bounds of a known binary (a Module).  When the process starts up, lldb learns where all the solibs are actually loaded from the dynamic loader -- it records where each Section is loaded in memory as part of the Target.  So given an Address object, we can get the actual address in real memory via the Target's section load list.

This abstraction solves a lot of subtle bugs we'd hit in gdb when an objfile would shift in memory to a new address -- we'd have to go through all the addresses that might have that old address and make sure they're updated correctly to the new one.  With lldb we have the section load list in the Target which has the current address for each Section.
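To make that concrete, here is a minimal sketch in plain Python. It is a toy model, not lldb code: the Section, Address and SectionLoadList classes below, the library names, and all the addresses are made up for illustration, and only the idea (section + offset, resolved through a per-target load list) comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """Toy stand-in for a section of a Module (hypothetical, not the lldb class)."""
    module: str
    name: str
    file_addr: int
    size: int


@dataclass(frozen=True)
class Address:
    """Section + offset; no absolute address is stored."""
    section: Section
    offset: int


class SectionLoadList:
    """Toy per-target map from a Section to its current load address."""

    def __init__(self):
        self._load_addr = {}

    def set_load_address(self, section, load_addr):
        self._load_addr[section] = load_addr

    def resolve(self, addr):
        """Return the load address, or None if the section is not loaded."""
        base = self._load_addr.get(addr.section)
        return None if base is None else base + addr.offset


# Two libraries whose text sections both start at file address 0.
liba_text = Section("liba.dylib", "__text", 0x0, 0x4000)
libb_text = Section("libb.dylib", "__text", 0x0, 0x4000)

# Same file address, but still two distinct Address values.
bp_a = Address(liba_text, 0x120)
bp_b = Address(libb_text, 0x120)

loads = SectionLoadList()
before_launch = loads.resolve(bp_a)  # nothing is loaded yet

# The dynamic loader reports where each section landed.
loads.set_load_address(liba_text, 0x100000000)
loads.set_load_address(libb_text, 0x100200000)
first_a = loads.resolve(bp_a)
first_b = loads.resolve(bp_b)

# liba moves: only the load list entry changes, bp_a itself is untouched.
loads.set_load_address(liba_text, 0x100800000)
moved_a = loads.resolve(bp_a)
```

The point of the sketch is that nothing holding bp_a needs to be found and patched when the library moves; the one entry in the load list is the only thing that changes.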

When we're getting a real memory address (a "load address", or as Greg was saying to me earlier, think of it as a "live address" if that helps) out of the process (e.g. we read the pc register), usually the first thing we'll do is convert that into an Address object (again, with help of the Target's section load list) putting it in terms of a Section and an offset.
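A rough sketch of that conversion, again as a toy model in plain Python and not lldb's implementation. The LOADED_SECTIONS table, the section names, the addresses, and the resolve_load_address helper are all hypothetical; the sketch assumes the load list can be searched by address range.

```python
from bisect import bisect_right

# Hypothetical loaded sections: (load address, size, name), sorted by load address.
LOADED_SECTIONS = [
    (0x100000000, 0x4000, "a.out.__TEXT"),
    (0x100004000, 0x1000, "a.out.__DATA"),
    (0x7FFF20000000, 0x8000, "libfoo.dylib.__TEXT"),
]
_STARTS = [start for start, _, _ in LOADED_SECTIONS]


def resolve_load_address(load_addr):
    """Return (section_name, offset) for a load address.

    If no loaded section contains the address, return (None, load_addr).
    """
    # Find the last section that starts at or before load_addr.
    i = bisect_right(_STARTS, load_addr) - 1
    if i >= 0:
        start, size, name = LOADED_SECTIONS[i]
        if load_addr < start + size:
            return name, load_addr - start
    return None, load_addr


pc = 0x100000F24  # e.g. a value read from the pc register
pc_addr = resolve_load_address(pc)
```

Here pc_addr comes back as the a.out text section plus 0xF24, which stays meaningful even if that section is later reported at a different load address.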

When a real memory address in the process doesn't belong to any of the binaries (Modules) -- for instance, a pointer into a heap allocation -- then we can put it in an Address object but it's just an offset alone with no section.  But addresses that correspond to a function or a symbol are expressed in terms of the containing section and offset into that section.
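The no-section case is also what made the old SBTarget::ReadMemory() misbehave, as Greg describes in the quoted text below. A small sketch using his numbers (file address 0x10001000, slide 0xef0000); the dictionary, the section name, and load_address_of are hypothetical stand-ins in plain Python, not lldb API:

```python
DATA_FILE_ADDR = 0x10001000  # file address of the data section in the quoted example
SLIDE = 0xEF0000             # slide applied at load time in the quoted example

# Toy section load list: section name -> load address.
section_load_list = {"a.out.__DATA": DATA_FILE_ADDR + SLIDE}


def load_address_of(section, offset):
    """Resolve a toy (section, offset) pair to the address to read in the process."""
    if section is None:
        # No section: the offset is taken to be a load address as-is.
        return offset
    return section_load_list[section] + offset


# Keeping the section: the read is resolved through the load list.
good = load_address_of("a.out.__DATA", 0)

# Rebuilding the address from the file address alone drops the section,
# so the file address is misread as a load address.
bad = load_address_of(None, DATA_FILE_ADDR)
```

With these values good is 0x10ef1000 and bad is 0x10001000, which matches the two addresses in Greg's explanation.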



> On Nov 4, 2014, at 10:18 AM, Greg Clayton <gclayton at apple.com> wrote:
> 
> 
>> On Nov 3, 2014, at 10:53 PM, Matthew Gardiner <mg11 at csr.com> wrote:
>> 
>> Hi Greg,
>> 
>> So what in lldb's world is the difference between a file and a load
>> address?
> 
> 
> "file address" (as LLDB treats them) is the address that is found in the object file (ELF, MachO, or COFF). "load address" is the actual address in the process where the section is loaded after being slid.
> 
> For example you might have a shared library that has a function "foo" whose file address is 0x1000, but when the shared library gets loaded into memory, it will be loaded at a different address because all shared libraries have functions in the low file address range (say from 0 to 0x400000 for example). So if the shared library gets loaded with a slide of 0x1000000, foo will have a load address of 0x1000000 + 0x1000.
> 
> Now for most embedded debugging where there is no OS that will slide things around, your file and load addresses will match. For actual OS level debugging, they won't for shared libraries, and might for the main executable. Sometimes the main executable doesn't get slid around, but other OSs will use ASLR to slide the main executables around for security.
> 
> The test that was written was trying to get a data section from the object file. Then it made a section offset address that pointed to that data section. Then it called SBTarget::ReadMemory(...) with that section offset address. What was happening on MacOSX was:
> 
> - get the data section whose file address was 0x10001000
> - Make a section + offset address from it that was represented as a.out's data section + 0 bytes
> - call target read memory
> 
> Prior to my fix this happened:
> 
> SBTarget::ReadMemory() made a new section offset address:
> 
> Address address(section_offset_addr.GetFileAddress(), NULL);
> 
> Now we have an address that has no section. If such an address is passed to anything that is trying to read memory from a live process, an address with no section is considered to be a "load address". It will try and read from the "load address" 0x10001000. But on MacOSX, or any OS with ASLR, the data section was slid by a random amount (like 0xef0000). So we would try to read from "load address" 0x10001000 and it would fail. If we leave the address object as a section offset address (don't make a new address like we did above), we pass this address to a read memory function and it will resolve the section offset address into an address in the live process, or into a load address. This will be 0x10001000 + 0xef0000 + 0. The load address of the data section is 0x10001000 + 0xef0000 and the offset was 0. And the resulting memory it will read from in the process is 0x10ef1000. The old way it would have tried to read from 0x10001000 which was incorrect.
> 
> 
>> In my world I consider the file and load addresses to be the
>> same thing, that is, the address (not the file offset) of the symbol in
>> the object file, e.g. when I objdump symbols and grep for ones I know of
>> in a kalimba ELF, I get
>> 
>> 0000054f g       DM|0	00000000 $_g_matt1
>> ...
>> 
>> 
>> 0x54f as the file address of g_matt1.
>> 
>> 
>> The other address terminology I hear of is "virtual address". To me this
>> is the address of the symbol once the binary is actually running on the
>> processor. So in some embedded scenarios (like kalimba where there is no
>> OS) we have code addresses in the ELF (i.e. file/load address) all
>> starting at 80000000 e.g.
>> 
>> 80000354 g     F PM|0	00000000 $_main
>> 
>> But on the device (since it's harvard architecture with a CODE and DATA
>> bus), main is actually at 0x0354. So in this context I'd say 0x80000354
>> was the "load/file address" but 0x0354 was the virtual address. I see
>> similar scenario with linux shared object files where in the file the
>> symbol addresses are often based at 0, but at runtime are fixed-up to
>> some arbitrary offset.   
>> 
>> Can you explain what lldb means by file/load/virtual and so on
>> addresses?   
> 
> 
> So in your terms:
> 
> virtual address is what we call the "load address".
> file address means address as it is found in the object file you loaded it from.
> 
> Many object file formats say "virtual address" when they mean what we call a file address, so I didn't think "virtual address" made as much sense as "load address". For example, mach-o segments have "vmaddr" and "vmsize" fields, which stand for virtual address and virtual size.
> 
> Hope that clears things up. 
> 
>> thanks    
>> Matt
>> 
>> 
>> 
>> On Tue, 2014-11-04 at 00:56 +0000, Greg Clayton wrote:
>>> Author: gclayton
>>> Date: Mon Nov  3 18:56:30 2014
>>> New Revision: 221213
>>> 
>>> URL: http://llvm.org/viewvc/llvm-project?rev=221213&view=rev
>>> Log:
>>> Fixed SBTarget::ReadMemory() to work correctly and the TestTargetAPI.py test case that was reading target memory in TargetAPITestCase.test_read_memory_with_dsym and TargetAPITestCase.test_read_memory_with_dwarf.
>>> 
>>> The problem was that SBTarget::ReadMemory() was making a new section offset lldb_private::Address by doing:
>>> 
>>> 
>>> size_t
>>> SBTarget::ReadMemory (const SBAddress addr,
>>>                     void *buf,
>>>                     size_t size,
>>>                     lldb::SBError &error)
>>> {
>>>       ...
>>>       lldb_private::Address addr_priv(addr.GetFileAddress(), NULL);
>>>       bytes_read = target_sp->ReadMemory(addr_priv, false, buf, size, err_priv);
>>> 
>>> 
>>> This is wrong. If you get the file address from the "addr" argument and try to read memory using that, it will think the file address is a load address and it will try to resolve it accordingly. This will work fine if your executable is loaded at the same address (no slide), but it won't work if there is a slide.
>>> 
>>> The fix is to just pass along "addr.ref()" instead of making a new addr_priv. This passes along the lldb_private::Address that is inside the SBAddress (which is what we want), and doesn't always change it into something that becomes a load address (if we are running) or an ambiguous file address (think of address zero when you have 150 shared libraries with sections that start at zero: which one would you pick?). The main reason for passing a section offset address to SBTarget::ReadMemory() is so you _can_ read from the actual section + offset that is specified in the SBAddress.
>>> 
>>> 
>>> 
>>> Modified:
>>>   lldb/trunk/source/API/SBTarget.cpp
>>>   lldb/trunk/test/python_api/target/TestTargetAPI.py
>>> 
>>> Modified: lldb/trunk/source/API/SBTarget.cpp
>>> URL: http://llvm.org/viewvc/llvm-project/lldb/trunk/source/API/SBTarget.cpp?rev=221213&r1=221212&r2=221213&view=diff
>>> ==============================================================================
>>> --- lldb/trunk/source/API/SBTarget.cpp (original)
>>> +++ lldb/trunk/source/API/SBTarget.cpp Mon Nov  3 18:56:30 2014
>>> @@ -1306,13 +1306,11 @@ SBTarget::ReadMemory (const SBAddress ad
>>>    if (target_sp)
>>>    {
>>>        Mutex::Locker api_locker (target_sp->GetAPIMutex());
>>> -        lldb_private::Address addr_priv(addr.GetFileAddress(), NULL);
>>> -        lldb_private::Error err_priv;    
>>> -        bytes_read = target_sp->ReadMemory(addr_priv, false, buf, size, err_priv);
>>> -        if(err_priv.Fail())
>>> -        {
>>> -            sb_error.SetError(err_priv.GetError(), err_priv.GetType());
>>> -        }
>>> +        bytes_read = target_sp->ReadMemory(addr.ref(), false, buf, size, sb_error.ref());
>>> +    }
>>> +    else
>>> +    {
>>> +        sb_error.SetErrorString("invalid target");
>>>    }
>>> 
>>>    return bytes_read;
>>> 
>>> Modified: lldb/trunk/test/python_api/target/TestTargetAPI.py
>>> URL: http://llvm.org/viewvc/llvm-project/lldb/trunk/test/python_api/target/TestTargetAPI.py?rev=221213&r1=221212&r2=221213&view=diff
>>> ==============================================================================
>>> --- lldb/trunk/test/python_api/target/TestTargetAPI.py (original)
>>> +++ lldb/trunk/test/python_api/target/TestTargetAPI.py Mon Nov  3 18:56:30 2014
>>> @@ -213,16 +213,20 @@ class TargetAPITestCase(TestBase):
>>>        breakpoint = target.BreakpointCreateByLocation("main.c", self.line_main)
>>>        self.assertTrue(breakpoint, VALID_BREAKPOINT)
>>> 
>>> +        # Put debugger into synchronous mode so when we target.LaunchSimple returns
>>> +        # it will guaranteed to be at the breakpoint
>>> +        self.dbg.SetAsync(False)
>>> +        
>>>        # Launch the process, and do not stop at the entry point.
>>>        process = target.LaunchSimple (None, None, self.get_process_working_directory())
>>> 
>>>        # find the file address in the .data section of the main
>>>        # module            
>>>        data_section = self.find_data_section(target)
>>> -        data_section_addr = data_section.file_addr
>>> -        a = target.ResolveFileAddress(data_section_addr)
>>> -
>>> -        content = target.ReadMemory(a, 1, lldb.SBError())
>>> +        sb_addr = lldb.SBAddress(data_section, 0)
>>> +        error = lldb.SBError()
>>> +        content = target.ReadMemory(sb_addr, 1, error)
>>> +        self.assertTrue(error.Success(), "Make sure memory read succeeded")
>>>        self.assertEquals(len(content), 1)
>>> 
>>>    def create_simple_target(self, fn):
>>> 
>>> 
>>> _______________________________________________
>>> lldb-commits mailing list
>>> lldb-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits
>>> 
>>> 
>> 
>> 
>> 
>> 
> 
> 




