[lldb-dev] Proposal: How to modify lldb for non 8-bit bytes for kalimba processors
Matthew Gardiner
mg11 at csr.com
Thu Aug 28 00:19:49 PDT 2014
Hi folks,
One of the challenges that I need to resolve regarding debugging kalimba
processors, is that certain variants have different notions of the size
(in bits) of a byte, compared to a lot of more mainstream processors.
What I'm referring to is the size of a minimum addressable unit, when
the processor accesses memory. For example, on a kalimba architecture
version 3, a "byte" (minimum addressable unit) from the data bus is
24-bits, so if the processor reads from address 8001 it reads 24-bits,
and from address 8002 the next 24-bits are read, and so on... (this also
means that for this variant a char, int, long, pointer are 24-bits in
size). For kalimba architecture version 4, however, we have the minimum
addressable unit being 8-bits, and correspondingly more "conventional"
sizes for primitive types.
I imagine that this will affect the kalimba lldb port in various ways.
The most obvious one, and hence the one I'd like to solve first, is the
way in which raw memory reads and writes are implemented. As an example, when I
ask lldb to read 4 "bytes" (addressable units worth of data) from a
kalimba with 8-bit bytes I expect to see this:
(lldb) memory read --count 4 0x0328
0x00000328: 00 07 08 08 ....
(lldb)
However, if the target processor has 24-bit bytes then I expect the same
query to yield the following answer:
(lldb) memory read --count 4 0x0328
0x00000328: 000708 080012 095630 023480
....
(lldb)
Just considering the above scenario leads me to believe that my first
challenge is arranging for the remote protocol implementation (currently
Process/gdb-remote et al) to assume Nx host bytes (N being a
target-specific value) for each target byte accessed, and for the memory
read and formatting code (above) to behave correctly, given the
discrepancy between host and target byte sizes. I guess I'll see many
other challenges - for example, frame variable decode, stack unwind etc.
(but since *those* challenges require work on clang/llvm backend, and
CSR have no llvm person yet, I want to concentrate on raw memory access
first...)
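The core of that first step can be sketched in isolation. The helper below is hypothetical (it is not an existing lldb function); it just captures the conversion the gdb-remote layer would need to make, assuming each target-addressable unit occupies N 8-bit host bytes on the wire:

```cpp
#include <cstdint>

// Hypothetical helper: how many 8-bit host bytes must be requested from
// the remote stub to cover `count` target-addressable units, where each
// target byte occupies `target_byte_size` host bytes (e.g. 3 for a
// 24-bit kalimba data byte, 1 for a conventional target).
uint64_t HostBytesForTargetBytes(uint64_t count, uint32_t target_byte_size) {
    return count * target_byte_size;
}
```

So a "memory read --count 4" against a 24-bit-data kalimba would ask the stub for 12 host bytes, while the same command against an 8-bit-byte target asks for 4, exactly as in the two examples above.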
For an added complication (since kalimba is a Harvard architecture)
certain kalimba variants have differing addressable unit sizes for
memory on the code bus and data bus. Kalimba Architecture 5 has 8-bit
addressable code, and 24-bit addressable data...
My initial idea for how to start to address the above challenge is to
augment the CoreDefinition structure in ArchSpec.cpp as follows:
struct CoreDefinition
{
    ByteOrder default_byte_order;
    uint32_t addr_byte_size;
    uint32_t min_opcode_byte_size;
    uint32_t max_opcode_byte_size;
+   uint32_t code_byte_size;
+   uint32_t data_byte_size;
    llvm::Triple::ArchType machine;
    ArchSpec::Core core;
    const char * const name;
};
where code_byte_size and data_byte_size would specify, in host (8-bit)
bytes, the size of the minimum addressable unit on the code and data
buses of the referenced architecture. So, e.g.
For kalimba 3, with 24-bit data bytes and 32-bit code bytes we'd have
data_byte_size=3 and code_byte_size=4
For kalimba 4, with 8-bit data bytes and 8-bit code bytes we'd have
data_byte_size=1 and code_byte_size=1
So, then I'd update the g_core_definitions array within ArchSpec.cpp
accordingly, such that all non-kalimbas would have 1 as the setting for
the two new fields, and the kalimba entries would have those fields set
to match the architectures.
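As a sketch of what such entries might look like (the table below is a simplified stand-in, not the real g_core_definitions array; the enum values and the non-kalimba entry are illustrative):

```cpp
#include <cstdint>

enum class ByteOrder { Little, Big };

// Simplified stand-in for ArchSpec.cpp's CoreDefinition, with the two
// proposed fields added; the machine/core enum fields are elided.
struct CoreDefinition {
    ByteOrder default_byte_order;
    uint32_t addr_byte_size;
    uint32_t min_opcode_byte_size;
    uint32_t max_opcode_byte_size;
    uint32_t code_byte_size;  // host bytes per code-bus addressable unit
    uint32_t data_byte_size;  // host bytes per data-bus addressable unit
    const char *name;
};

// Hypothetical entries: conventional targets keep 1/1, kalimba 3 gets
// 32-bit (4 host byte) code units and 24-bit (3 host byte) data units.
const CoreDefinition g_core_definitions[] = {
    { ByteOrder::Little, 4, 1, 15, 1, 1, "x86" },
    { ByteOrder::Little, 4, 4, 4,  4, 3, "kalimba3" },
    { ByteOrder::Little, 4, 1, 4,  1, 1, "kalimba4" },
};
```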
The ArchSpec class would then require the following accessors: uint32_t
GetCodeByteSize() and uint32_t GetDataByteSize(); to supply client code
with the required hints to correctly implement memory accesses.
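A minimal sketch of those accessors, assuming ArchSpec keeps a pointer to its matched core definition (the CoreDef struct and member name here are simplified stand-ins for the real class internals):

```cpp
#include <cstdint>

// Simplified stand-in for the matched CoreDefinition entry.
struct CoreDef {
    uint32_t code_byte_size;
    uint32_t data_byte_size;
};

class ArchSpec {
public:
    explicit ArchSpec(const CoreDef *core) : m_core(core) {}

    // Fall back to 1 host byte when no core definition has been matched,
    // so all existing 8-bit-byte targets keep behaving exactly as before.
    uint32_t GetCodeByteSize() const {
        return m_core ? m_core->code_byte_size : 1;
    }
    uint32_t GetDataByteSize() const {
        return m_core ? m_core->data_byte_size : 1;
    }

private:
    const CoreDef *m_core;
};
```

The defaulting to 1 matters for the "would this make other architectures harder to enhance" question: callers that never set the new fields see no behavioural change.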
My next plan would be to "massage" the code in the execution flow from
an (lldb) memory read invocation through to the gdb-remote comms until I
see the memory read examples I illustrated above, working for 8-bit and
24-bit data kalimba targets.
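The display end of that flow could look roughly like this. The function below is a hypothetical sketch of the grouping step only, assuming host bytes come back from the stub in the order shown in the examples above (i.e. printed in received order within each unit):

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Group the raw host bytes returned by the remote stub into
// target-addressable units of `data_byte_size` host bytes each,
// separated by spaces, as in the example (lldb) sessions above.
std::string FormatTargetBytes(const std::vector<uint8_t> &host_bytes,
                              uint32_t data_byte_size) {
    std::string out;
    char hex[3];
    for (size_t i = 0; i < host_bytes.size(); ++i) {
        if (i != 0 && i % data_byte_size == 0)
            out += ' ';
        std::snprintf(hex, sizeof hex, "%02x", host_bytes[i]);
        out += hex;
    }
    return out;
}
```

With data_byte_size=1 the twelve host bytes from the first example render as "00 07 08 08 ..."; with data_byte_size=3 the same stream renders as "000708 080012 ...".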
I'd appreciate comments and opinions from the lldb community on what
I've described above. Basically, I'm curious as to what people think
of the whole concept, e.g.
"You can't possibly do that, so many other architectures have 8-bit
bytes, and so this proposal would make them harder to enhance, for the
benefit of (currently) just kalimba"
"Yes, that's a good idea, lldb can accommodate the most unusual of
architectures"
And I'm also interested in technical comments, e.g. should an instance
of CoreDefinition be added to ArchSpec, or is just adding the extra
byte-size attributes sufficient... or does anyone think that modifying
gdb-remote is a bad idea, and that I should instead be creating kalimba
process abstractions (and factoring out the common code)?
thanks
Matt