[lldb-dev] Proposal: How to modify lldb for non 8-bit bytes for kalimba processors

Fri Aug 29 05:11:44 PDT 2014

Based on some recent investigation, it looks as if I won't need to 
modify the CoreDefinition structure of ArchSpec.cpp. In a local change, 
I've added entries for the kalimba variants to the SubArchType enum of 
llvm::Triple. So it's now possible for me to implement

uint32_t ArchSpec::GetCodeByteSize() const
uint32_t ArchSpec::GetDataByteSize() const

by inspecting the subarch field of the triple contained in ArchSpec.
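
A minimal sketch of what those two accessors might look like. To keep it self-contained, the KalimbaSubArch enum below is a hypothetical stand-in for the subarch field of llvm::Triple, and the functions are free functions rather than ArchSpec members. The sizes are the ones quoted further down for kalimba 3, 4 and 5.

```cpp
#include <cstdint>

// Hypothetical stand-in for the kalimba entries in llvm::Triple's
// SubArchType; the enumerator names are assumptions for illustration.
enum class KalimbaSubArch { None, V3, V4, V5 };

// Size of one code-bus addressable unit, in 8-bit host bytes.
uint32_t GetCodeByteSize(KalimbaSubArch subarch) {
    switch (subarch) {
    case KalimbaSubArch::V3: return 4; // 32-bit code words
    case KalimbaSubArch::V4: return 1; // 8-bit code bytes
    case KalimbaSubArch::V5: return 1; // 8-bit code bytes
    case KalimbaSubArch::None: break;
    }
    return 1; // every non-kalimba target keeps 8-bit bytes
}

// Size of one data-bus addressable unit, in 8-bit host bytes.
uint32_t GetDataByteSize(KalimbaSubArch subarch) {
    switch (subarch) {
    case KalimbaSubArch::V3: return 3; // 24-bit data words
    case KalimbaSubArch::V4: return 1; // 8-bit data bytes
    case KalimbaSubArch::V5: return 3; // 24-bit data words
    case KalimbaSubArch::None: break;
    }
    return 1; // every non-kalimba target keeps 8-bit bytes
}
```

Defaulting to 1 means callers on conventional targets would see no change in behaviour.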

However, I'd still appreciate some feedback on a more conceptual level 
regarding this proposal.

thanks
Matt

Matthew Gardiner wrote:
> Hi folks,
>
> One of the challenges that I need to resolve in debugging kalimba 
> processors is that certain variants have a different notion of the 
> size (in bits) of a byte from that of more mainstream processors. What 
> I'm referring to is the size of the minimum addressable unit when the 
> processor accesses memory. For example, on kalimba architecture 
> version 3, a "byte" (minimum addressable unit) on the data bus is 24 
> bits, so if the processor reads from address 8001 it reads 24 bits, 
> from address 8002 it reads the next 24 bits, and so on (this also 
> means that for this variant a char, int, long and pointer are each 24 
> bits in size). For kalimba architecture version 4, however, the 
> minimum addressable unit is 8 bits, with correspondingly more 
> "conventional" sizes for primitive types.
>
> I imagine that this will affect the kalimba lldb port in various ways. 
> The most obvious one, and hence the one I'd like to solve first, is 
> the way in which raw memory reads and writes are implemented. As an 
> example, when I ask lldb to read 4 "bytes" (addressable units' worth of 
> data) from a kalimba with 8-bit bytes I expect to see this:
>
> (lldb) memory read --count 4 0x0328
> 0x00000328: 00 07 08 08                                      ....
> (lldb)
>
> However, if the target processor has 24-bit bytes then I expect the 
> same query to yield the following answer:
>
> (lldb) memory read --count 4 0x0328
> 0x00000328: 000708 080012 095630 023480                  ....
> (lldb)
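
A rough sketch of the grouping step behind the two dumps above, assuming the raw data arrives as a flat buffer of 8-bit host bytes and that the host bytes of each target byte are printed in the order received. FormatTargetBytes is a hypothetical helper, not an existing lldb function.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Groups host bytes into target bytes of target_byte_size host bytes each
// and renders every group as one hex token, separated by single spaces.
// A trailing partial group (if any) is dropped.
std::string FormatTargetBytes(const std::vector<uint8_t> &host_bytes,
                              uint32_t target_byte_size) {
    std::string out;
    if (target_byte_size == 0)
        return out; // nothing sensible to print
    char hex[3];
    for (size_t i = 0; i + target_byte_size <= host_bytes.size();
         i += target_byte_size) {
        if (!out.empty())
            out += ' ';
        for (uint32_t j = 0; j < target_byte_size; ++j) {
            std::snprintf(hex, sizeof(hex), "%02x",
                          static_cast<unsigned>(host_bytes[i + j]));
            out += hex;
        }
    }
    return out;
}
```

With a target byte size of 1 this produces the conventional "00 07 08 08" layout. With a size of 3, twelve host bytes collapse into the four 6-digit tokens shown in the 24-bit example.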
>
> Just considering the above scenario leads me to believe that my first 
> challenge is arranging for the remote protocol implementation 
> (currently Process/gdb-remote et al) to assume Nx host bytes (N being 
> a target-specific value) for each target byte accessed, and for the 
> memory read and formatting code (above) to behave correctly, given the 
> discrepancy between host and target byte sizes. I guess I'll see many 
> other challenges - for example, frame variable decode, stack unwind 
> etc. (but since *those* challenges require work on clang/llvm backend, 
> and CSR have no llvm person yet, I want to concentrate on raw memory 
> access first...)
>
> As an added complication (since kalimba is a Harvard architecture), 
> certain kalimba variants have differing addressable unit sizes for 
> memory on the code bus and data bus. Kalimba Architecture 5 has 8-bit 
> addressable code, and 24-bit addressable data...
>
> My initial idea for how to start to address the above challenge is to 
> augment the CoreDefinition structure in ArchSpec.cpp as follows:
>
>     struct CoreDefinition
>     {
>         ByteOrder default_byte_order;
>         uint32_t addr_byte_size;
>         uint32_t min_opcode_byte_size;
>         uint32_t max_opcode_byte_size;
> +       uint32_t code_byte_size;
> +       uint32_t data_byte_size;
>         llvm::Triple::ArchType machine;
>         ArchSpec::Core core;
>         const char * const name;
>     };
>
> Here code_byte_size and data_byte_size would specify, in host (8-bit) 
> bytes, the sizes of the minimum addressable units on the code and data 
> buses of the referenced architecture. So, e.g.
> For kalimba 3, with 24-bit data bytes and 32-bit code bytes we'd have 
> data_byte_size=3 and code_byte_size=4
> For kalimba 4, with 8-bit data bytes and 8-bit code bytes we'd have 
> data_byte_size=1 and code_byte_size=1
>
> So, then I'd update the g_core_definitions array within ArchSpec.cpp 
> accordingly, such that all non-kalimbas would have 1 as the setting 
> for the new fields and the kalimba entries would have those fields set 
> to match the architectures.
>
> The ArchSpec class would then require the following accessors: 
> uint32_t GetCodeByteSize() and uint32_t GetDataByteSize(); to supply 
> client code with the required hints to correctly implement memory 
> accesses.
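
A cut-down sketch of the table-driven variant described above. The struct here is a reduced stand-in for CoreDefinition (only a name plus the two new fields), the core names are hypothetical, and the lookup is keyed by name purely to keep the example self-contained.

```cpp
#include <cstdint>
#include <cstring>

// Reduced stand-in for CoreDefinition: only the fields this sketch needs.
struct CoreDefinition {
    const char *name;        // core names below are hypothetical
    uint32_t code_byte_size; // code-bus unit size in 8-bit host bytes
    uint32_t data_byte_size; // data-bus unit size in 8-bit host bytes
};

// Non-kalimba cores use 1 for both new fields.
static const CoreDefinition g_core_definitions[] = {
    {"x86_64", 1, 1},
    {"kalimba3", 4, 3}, // 32-bit code words, 24-bit data words
    {"kalimba4", 1, 1}, // 8-bit code and data bytes
    {"kalimba5", 1, 3}, // 8-bit code bytes, 24-bit data words
};

// Returns the matching table entry, or nullptr for an unknown core.
const CoreDefinition *FindCoreDefinition(const char *name) {
    for (const CoreDefinition &def : g_core_definitions)
        if (std::strcmp(def.name, name) == 0)
            return &def;
    return nullptr;
}

// Accessors fall back to 1 (an 8-bit byte) when the core is unknown.
uint32_t GetCodeByteSize(const char *name) {
    const CoreDefinition *def = FindCoreDefinition(name);
    return def ? def->code_byte_size : 1;
}

uint32_t GetDataByteSize(const char *name) {
    const CoreDefinition *def = FindCoreDefinition(name);
    return def ? def->data_byte_size : 1;
}
```

Client code issuing a raw read would then multiply the requested count of target bytes by GetDataByteSize() to get the number of host bytes to fetch.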
>
> My next plan would be to "massage" the code in the execution flow from 
> an (lldb) memory read invocation through to the gdb-remote comms until 
> I see the memory read examples I illustrated above, working for 8-bit 
> and 24-bit data kalimba targets.
>
> I'd appreciate all comments and opinions as to what I've described 
> above from the lldb community. Basically, I'm curious as to what 
> people think of the whole concept, e.g.
>
> "You can't possibly do that, so many other architectures have 8-bit 
> bytes, and so this proposal would make them harder to enhance, for the 
> benefit of (currently) just kalimba"
> "Yes, that's a good idea, lldb can accommodate the most unusual of 
> architectures"
>
> And I'm also interested in technical comments, e.g. should an instance 
> of CoreDefinition be added to ArchSpec, or is just adding the extra 
> byte-size attributes sufficient? Or does anyone think that modifying 
> gdb-remote is a bad idea, and that I should instead be creating kalimba 
> process abstractions (and factoring out the common code)?
>
> thanks
> Matt