[lldb-dev] Advice on debugging DSP and Harvard architectures
Matthew Gardiner
mg11 at csr.com
Mon Jun 2 04:06:55 PDT 2014
Greg Clayton wrote:
> Addresses in LLDB are represented by
>
> class lldb_private::Address {
>     lldb::SectionWP m_section_wp;       ///< The section for the address, can be NULL.
>     std::atomic<lldb::addr_t> m_offset; ///< Offset into section if \a m_section_wp is valid...
> };
>
> The section class:
>
> class lldb_private::Section {
> ObjectFile *m_obj_file; // The object file that data for this section should be read from
> lldb::SectionType m_type; // The type of this section
> lldb::SectionWP m_parent_wp; // Weak pointer to parent section
> ConstString m_name; // Name of this section
> lldb::addr_t m_file_addr; // The absolute file virtual address range of this section if m_parent == NULL,
> // offset from parent file virtual address if m_parent != NULL
> lldb::addr_t m_byte_size; // Size in bytes that this section will occupy in memory at runtime
> lldb::offset_t m_file_offset; // Object file offset (if any)
> lldb::offset_t m_file_size; // Object file size (can be smaller than m_byte_size for zero filled sections...)
> SectionList m_children; // Child sections
> bool m_fake:1, // If true, then this section only can contain the address if one of its
> // children contains an address. This allows for gaps between the children
> // that are contained in the address range for this section, but do not produce
> // hits unless the children contain the address.
> m_encrypted:1, // Set to true if the contents are encrypted
> m_thread_specific:1;// This section is thread specific
>
> };
>
> The section type "m_type" is one of:
>
> typedef enum SectionType
> {
> eSectionTypeInvalid,
> eSectionTypeCode,
> eSectionTypeContainer, // The section contains child sections
> eSectionTypeData,
> eSectionTypeDataCString, // Inlined C string data
> eSectionTypeDataCStringPointers, // Pointers to C string data
> eSectionTypeDataSymbolAddress, // Address of a symbol in the symbol table
> eSectionTypeData4,
> eSectionTypeData8,
> eSectionTypeData16,
> eSectionTypeDataPointers,
> eSectionTypeDebug,
> eSectionTypeZeroFill,
> eSectionTypeDataObjCMessageRefs, // Pointer to function pointer + selector
> eSectionTypeDataObjCCFStrings, // Objective C const CFString/NSString objects
> eSectionTypeDWARFDebugAbbrev,
> eSectionTypeDWARFDebugAranges,
> eSectionTypeDWARFDebugFrame,
> eSectionTypeDWARFDebugInfo,
> eSectionTypeDWARFDebugLine,
> eSectionTypeDWARFDebugLoc,
> eSectionTypeDWARFDebugMacInfo,
> eSectionTypeDWARFDebugPubNames,
> eSectionTypeDWARFDebugPubTypes,
> eSectionTypeDWARFDebugRanges,
> eSectionTypeDWARFDebugStr,
> eSectionTypeDWARFAppleNames,
> eSectionTypeDWARFAppleTypes,
> eSectionTypeDWARFAppleNamespaces,
> eSectionTypeDWARFAppleObjC,
> eSectionTypeELFSymbolTable, // Elf SHT_SYMTAB section
> eSectionTypeELFDynamicSymbols, // Elf SHT_DYNSYM section
> eSectionTypeELFRelocationEntries, // Elf SHT_REL or SHT_RELA section
> eSectionTypeELFDynamicLinkInfo, // Elf SHT_DYNAMIC section
> eSectionTypeEHFrame,
> eSectionTypeOther
>
> } SectionType;
>
> So we see we have eSectionTypeCode and eSectionTypeData.
>
> This could be used to help make the correct reads if addresses fall within known address ranges that fall into sections within a binary.
Thanks for your help with this, Greg. I am currently trying to understand
the above structures. It will probably take some time before I get it all
clear in my head, though.
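As a minimal sketch of the idea quoted above (using a section's type to decide which bus a read should go to), the following uses simplified stand-in types. SectionRange, SectionKind, Bus and BusForAddress are hypothetical names for illustration only and are not part of LLDB:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the LLDB types quoted above.
enum class SectionKind { Code, Data };
enum class Bus { Unknown, Program, Data };

struct SectionRange {
    std::uint64_t start; // first address covered by the section
    std::uint64_t size;  // size of the section in addressable units
    SectionKind kind;    // plays the role of m_type
};

// Returns the bus implied by the first section that contains addr,
// or Bus::Unknown if no section contains it.
Bus BusForAddress(const std::vector<SectionRange> &sections,
                  std::uint64_t addr) {
    for (const SectionRange &s : sections) {
        if (addr >= s.start && addr - s.start < s.size)
            return s.kind == SectionKind::Code ? Bus::Program : Bus::Data;
    }
    return Bus::Unknown;
}
```

Note that on a Harvard target the same numeric address can be valid on both buses, so a lookup by address alone (as here) only works when the caller already knows which section list to search, or when the ranges do not overlap.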
> I am guessing that there are code and data reads that are not found within any sections from files right?
I can't comment on your question above just yet, since I'm concentrating
on figuring out how to get a "disassemble" command (from lldb) to read
from the correct bus on our devices.
We want to ensure that disassembly always reads from the device (not
from the ELF file), since:
1. we prefer to always read from the device for disassembly, since it is
then easy to spot if our users have chosen the wrong ELF file.
2. we may try to debug without symbol files. This is a corner case, however.
3. we may encounter self-modifying code.
As a quick check I tried debugging a native 64-bit Linux process on
Linux, and when I invoked a simple disassemble-from-address command
(e.g. di -s 0x4004f0 -c 10), I observed that the target's memory is read:
#0 lldb_private::Process::ReadMemoryFromInferior
#1 lldb_private::MemoryCache::Read
#2 lldb_private::Process::ReadMemory
#3 lldb_private::Target::ReadMemory
...
#5 lldb_private::Disassembler::Disassemble
(I'll try debugging using a remote target shortly, for comparison...)
Whilst debugging, I observed that in the parameter
"const Address &start_address" of #5 lldb_private::Disassembler::Disassemble,
the m_section_wp data is 0x0. In your reply, are you suggesting that I
arrange for this data to be populated with a valid section pointer whose
m_type is eSectionTypeCode?
>
> If so we would need to change all functions that take a load address ("lldb::addr_t load_addr") into something that takes a load address + segment, which should be a new struct type like:
>
> struct ResolvedAddress {
>     lldb::addr_t addr;
>     lldb::segment_t segment;
> };
>
> Then all things like Read/Write memory in the process would need to be switched over to use the ResolvedAddress.
>
> The lldb_private::Address function that is:
>
> lldb::addr_t
> Address::GetLoadAddress (Target *target) const;
>
> Would now need to be switched over to:
>
> ResolvedAddress
> Address::GetLoadAddress (Target *target) const;
>
> We would need an invalid segment ID (like UINT32_MAX or UINT64_MAX) to indicate that there is no segment.
I couldn't find segment_t in my checkout. So I assume that you're
floating this as an idea for me to try out :-) I could certainly give it
a try with my working copy... and let you know how I get on.
Were you suggesting that the value of segment_t for our Harvard case
would be hard-coded somewhere in our Target code, and if the
m_section_wp of the Address object is a valid code section, then we'd
pull out this constant?
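A minimal sketch of how the proposal might look, under stated assumptions: segment_t does not exist in LLDB, and the constants kCodeSegment and kDataSegment, the Resolve helper and the ToyTarget type are all hypothetical names invented for this illustration. It shows a load address paired with a segment, an invalid-segment sentinel, and a memory read that dispatches on the segment:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical types; lldb::segment_t is only a proposal in this thread.
using addr_t = std::uint64_t;
using segment_t = std::uint32_t;

constexpr segment_t kInvalidSegment = UINT32_MAX; // "there is no segment"
constexpr segment_t kCodeSegment = 0; // assumed target-specific constant
constexpr segment_t kDataSegment = 1; // assumed target-specific constant

struct ResolvedAddress {
    addr_t addr;
    segment_t segment;
};

enum class SectionKind { None, Code, Data };

// Stand-in for a GetLoadAddress that also reports the segment, chosen
// from the type of the section the address belongs to (if any).
ResolvedAddress Resolve(addr_t load_addr, SectionKind kind) {
    switch (kind) {
    case SectionKind::Code: return {load_addr, kCodeSegment};
    case SectionKind::Data: return {load_addr, kDataSegment};
    case SectionKind::None: break;
    }
    return {load_addr, kInvalidSegment};
}

// Toy target with one byte array per segment (bus).
struct ToyTarget {
    std::map<segment_t, std::vector<std::uint8_t>> memory;

    // Copies up to len bytes into dst and returns the number copied.
    // Returns 0 for an unknown segment or an out-of-range address.
    std::size_t ReadMemory(const ResolvedAddress &ra, std::uint8_t *dst,
                           std::size_t len) const {
        auto it = memory.find(ra.segment);
        if (it == memory.end() || ra.addr >= it->second.size())
            return 0;
        const std::size_t avail =
            static_cast<std::size_t>(it->second.size() - ra.addr);
        const std::size_t n = std::min(len, avail);
        std::copy_n(it->second.begin() + static_cast<std::ptrdiff_t>(ra.addr),
                    n, dst);
        return n;
    }
};
```

In this sketch an address with no section resolves to kInvalidSegment and the read simply fails; a real target would need to decide on a default bus for that case.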
> So all in all this would be quite a big fix that would involve a lot of the code.
Indeed, but from my perspective probably a good way for me to learn more
of the code-base.
> This would be a big change. Not sure how other debuggers handle 24 bit bytes. It might be better to leave this as is and if you read 3 bytes of memory from your DSP, you get 9 bytes back.
>
> If a variable for your DSP is referenced in DWARF, what does the byte size show? The actual size in 8 bit bytes, or the size in 24 bit bytes?
I'm not sure on this one, Greg. I'm leaving it for one of my colleagues
to research further and then get back to you.
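The arithmetic behind "read 3 bytes, get 9 bytes back" can be sketched as follows. HostBytesForUnits and PackWords24 are hypothetical helper names, and the most-significant-byte-first ordering within each 24-bit word is an assumption made for illustration, not something stated in this thread:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of 8-bit host bytes needed to hold `units` addressable units
// that are each `bits_per_unit` wide (assumed to be a multiple of 8).
constexpr std::size_t HostBytesForUnits(std::size_t units,
                                        unsigned bits_per_unit) {
    return units * (bits_per_unit / 8);
}

// Packs a buffer of 8-bit bytes into 24-bit words, three bytes per word.
// ASSUMPTION: the most significant byte of each word comes first.
// Any trailing bytes that do not form a complete word are ignored.
std::vector<std::uint32_t> PackWords24(const std::vector<std::uint8_t> &bytes) {
    std::vector<std::uint32_t> words;
    for (std::size_t i = 0; i + 3 <= bytes.size(); i += 3) {
        words.push_back((static_cast<std::uint32_t>(bytes[i]) << 16) |
                        (static_cast<std::uint32_t>(bytes[i + 1]) << 8) |
                        static_cast<std::uint32_t>(bytes[i + 2]));
    }
    return words;
}
```

With 24-bit addressable units, HostBytesForUnits(3, 24) evaluates to 9, which matches the figure quoted above.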
> It would be great to enable support for these kinds of architectures in LLDB, and it will take some work, but we should be able to make it happen.
Indeed. I'll keep you posted with my progress on the above.
Matt