[lldb-dev] Advice on debugging DSP and Harvard architectures

Wed Jun 4 06:36:10 PDT 2014

Greg:

Thanks for all your code suggestions. I can't comment on them just now 
as I'm still single-stepping the existing code and trying to think it 
all through.

lldb-dev:

I can't help but think, though, that I'm going to have problems getting 
the lldb disassembler to work against a traditional Harvard 
architecture. The kind of architecture I'm describing is one in which 
code and data have completely separate address spaces, so address 0 on 
the code bus is different from address 0 on the data bus.

So somewhere in my debugserver, I need to invoke either:

device_read_dm(buffer, address, length)
or
device_read_pm(buffer, address, length)

Where dm means "data bus" and pm means "program/code bus".

The problem, in my mind, is in this interface:
"disassemble --start-address <addr> --end-address <addr>"

It gives no way to say which bus to read, since in a unified address 
space model (e.g. when debugging an Intel x86) there is no need to 
disambiguate between code and data.

In my company's current debugger, a request to disassemble *always* 
results in a read of the code bus. However, were I to port lldb to 
debug our architectures, that approach would not be ideal. In a generic 
debugger, I imagine that whilst for the most part the developer would 
want to disassemble real running code, there is a corner case (e.g. 
they are working on an interpreter) where they may want to disassemble 
from a piece of data (into which they previously copied some code).

So I think I am stuck here. How do you see disassemble working in this 
scenario? Should a disassemble command always expect to read from memory 
originally mapped in from a code section? Or is that definition too 
restrictive?

On further thought, we'd have a similar issue with the "memory read" 
commands, since there, too, no distinction is made between code and data.

So it seems that the best way for me to debug our architectures with 
lldb is to form a unified 64-bit address space (our chips currently have 
16-bit, 24-bit and 32-bit address spaces) and to set the top bit (bit 63) 
for code bus access.

Therefore if one of our users wants to disassemble from the code bus, 
they'd say
(lldb) di -s 0x80000000004004f0

but for data they'd say:
(lldb) di -s 0x4004f0
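
To make the scheme concrete, here is a minimal sketch of how the 
debugserver side might decode such a unified address and pick the bus. 
Everything here is an assumption for illustration: the names Bus, 
kCodeBusBit, MakeUnified, BusOf, DeviceAddr, DeviceReaders and 
ReadUnified are hypothetical and not lldb API, and the signature given 
to the device_read_dm/device_read_pm style callbacks (returning int, 
taking a 64-bit address) is a guess.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper names; none of these are part of lldb.
enum class Bus { Data, Program };

// Assumption: bit 63 of the unified 64-bit address selects the code bus.
constexpr std::uint64_t kCodeBusBit = 1ULL << 63;

// Build a unified address from a bus and a native device address.
constexpr std::uint64_t MakeUnified(Bus bus, std::uint64_t device_addr) {
  return bus == Bus::Program ? (device_addr | kCodeBusBit)
                             : (device_addr & ~kCodeBusBit);
}

// Which bus does a unified address refer to?
constexpr Bus BusOf(std::uint64_t unified) {
  return (unified & kCodeBusBit) ? Bus::Program : Bus::Data;
}

// Strip the tag bit to recover the address the chip understands.
constexpr std::uint64_t DeviceAddr(std::uint64_t unified) {
  return unified & ~kCodeBusBit;
}

// Assumed shape of device_read_dm / device_read_pm.
using ReadFn = int (*)(void *buffer, std::uint64_t address, std::size_t length);

struct DeviceReaders {
  ReadFn read_dm;  // data bus
  ReadFn read_pm;  // program/code bus
};

// Route a memory read to the right bus based on the tag bit.
int ReadUnified(const DeviceReaders &dev, void *buffer,
                std::uint64_t unified, std::size_t length) {
  ReadFn read = (BusOf(unified) == Bus::Program) ? dev.read_pm : dev.read_dm;
  return read(buffer, DeviceAddr(unified), length);
}
```

With this, a read of 0x80000000004004f0 would go to the code bus at 
0x4004f0, and a read of 0x4004f0 would go to the data bus at the same 
native address.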

The following challenges then arise:

1. When the user disassembles using a function name, the address 
discovered from the ELF file would need to have the code-bus bit set 
before the memory access is made.
2. When the PC is read from the chip, it would have to have this bit 
set before its value is presented to the user, or to a memory read 
function, to be consistent.
3. I'm also imagining some issues affecting stack unwinds, since the 
return address of a frame, read from the stack, will of course require 
the same bit to be set prior to disassembling that frame.
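
All three challenges come down to tagging a raw value at the point 
where it enters the debugger. A rough sketch of one way to centralise 
that follows. The names AddressOrigin, IsCodeOrigin and ToUnified are 
hypothetical, and masking the raw value to the chip's address width is 
my own assumption about how the 16/24/32-bit parts would be handled.

```cpp
#include <cstdint>

// Assumption: bit 63 marks a code-bus address in the unified space.
constexpr std::uint64_t kCodeTag = 1ULL << 63;

// Hypothetical classification of where a raw address came from.
enum class AddressOrigin {
  ElfCodeSymbol,   // challenge 1: function address looked up in the ELF file
  ProgramCounter,  // challenge 2: PC register read from the chip
  ReturnAddress,   // challenge 3: return address read from the stack
  ElfDataSymbol,   // data symbol, stays on the data bus
  DataPointer      // any other data-bus address
};

constexpr bool IsCodeOrigin(AddressOrigin origin) {
  return origin == AddressOrigin::ElfCodeSymbol ||
         origin == AddressOrigin::ProgramCounter ||
         origin == AddressOrigin::ReturnAddress;
}

// Convert a raw value to a unified address. The raw value is first
// masked to the chip's native address width (e.g. 16, 24 or 32 bits),
// then the code tag is set for code origins. Because the mask clears
// any existing tag, applying this twice gives the same result.
constexpr std::uint64_t ToUnified(AddressOrigin origin, std::uint64_t raw,
                                  unsigned address_bits) {
  const std::uint64_t mask =
      address_bits >= 63 ? ~kCodeTag : ((1ULL << address_bits) - 1);
  const std::uint64_t native = raw & mask;
  return IsCodeOrigin(origin) ? (native | kCodeTag) : native;
}
```

For example, a PC of 0x4004f0 on a 24-bit part would become 
0x80000000004004f0, while a data pointer with the same raw value would 
stay 0x4004f0.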

An internet search revealed similar issues when producing a debugger for 
AVR processors using gdb and eclipse:

http://avr-eclipse.sourceforge.net/wiki/index.php/Debugging#Harvard_Architecture

A similar solution was applied in this case, but this time the offset 
was applied to data addresses.

With the kind of issues/challenges I outlined above I think I may need 
to make some big changes in lldb. I'm wondering whether there is a 
convenient pre-defined abstraction layer. At first I thought that the 
"Target" class would be the right place to subclass, but it does not 
have the "pure virtuals" that I would expect to see. So I looked at the 
"Process" layer, which has the expected "pure virtuals", but 
unfortunately having both

class ProcessPOSIX : public Process
and
class ProcessGDBRemote : public Process

here, makes me think that either 1) this is the wrong place or 2) that 
the exact current positioning of ProcessGDBRemote in the hierarchy is wrong.

If you, or anyone else on the list, have any more input into my dilemma, 
I'd greatly appreciate your thoughts.

thanks
Matt
