[llvm-dev] RFC: On non 8-bit bytes and the target for it
Dmitriy Borisenkov via llvm-dev
llvm-dev at lists.llvm.org
Thu Oct 31 06:41:12 PDT 2019
> So, if I understand correctly, your memory is a key-value store where the
> keys are 257-bit values and the values are arrays of 257-bit values?
Both the keys and the values are 257 bits wide:
- A pointer to an object is a 257-bit integer.
- A pointer to a field of an object is also a 257-bit integer.
- An arbitrary void* is a 257-bit integer as well.
- "Hello, world" is an array of 257-bit characters.
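To make the list above concrete, here is a rough Python model of such a memory. It is only a sketch under assumptions, not our backend's implementation: the names Memory, alloc, store, load and store_c_string are made up for illustration, values are treated as unsigned, address 0 is assumed to be a null pointer, and a plain dict stands in for the real storage.

```python
# Hypothetical model of a memory whose keys and values are 257-bit integers.
# Names and the unsigned range check are illustrative assumptions.
CELL_BITS = 257
CELL_MASK = (1 << CELL_BITS) - 1


class Memory:
    """Key-value memory: 257-bit keys (addresses) map to 257-bit values."""

    def __init__(self):
        self.cells = {}
        self.next_free = 1  # assumption: address 0 is reserved as null

    def alloc(self, n):
        """Reserve n consecutive addresses and return the first one."""
        base = self.next_free
        self.next_free += n
        return base

    def store(self, addr, value):
        if not 0 <= addr <= CELL_MASK or not 0 <= value <= CELL_MASK:
            raise ValueError("address and value must fit in 257 bits")
        self.cells[addr] = value

    def load(self, addr):
        return self.cells.get(addr, 0)


def store_c_string(mem, text):
    """Store a NUL-terminated string, one character per 257-bit cell."""
    data = text.encode("ascii") + b"\0"
    base = mem.alloc(len(data))
    for i, ch in enumerate(data):
        mem.store(base + i, ch)
    return base


mem = Memory()
s = store_c_string(mem, "Hello, world")  # pointer to the object
w = s + 7                                # pointer to an element: also an integer
print(chr(mem.load(w)))
```

In this model an object pointer, an element pointer and a void* are all the same kind of value, and pointer arithmetic is plain integer addition.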
It is indeed wasteful for characters and pointers to occupy that much space.
However, a realistic contract that can run on the virtual machine without
exceeding gas limits can't use strings and memory extensively, so we've
chosen the simplest possible implementation. If other targets with
non-8-bit bytes pack multiple 8-bit characters into a single byte, and
that design is convenient for the community to maintain, we can probably
reimplement strings that way too.
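For comparison, a packed layout could look roughly like the Python sketch below. This is not something we have implemented: pack_string and load_char are hypothetical names, the 32-characters-per-cell figure follows the chunk size suggested in the quoted message (32 * 8 = 256 bits, which fits in a 257-bit cell), and the little-endian order inside a cell is an arbitrary assumption.

```python
# Hypothetical packed string layout: 32 eight-bit characters per 257-bit cell.
CHARS_PER_CELL = 32  # 32 * 8 = 256 bits, fits in one 257-bit cell


def pack_string(text):
    """Pack a NUL-terminated ASCII string into a list of cell values."""
    data = text.encode("ascii") + b"\0"
    cells = []
    for i in range(0, len(data), CHARS_PER_CELL):
        chunk = data[i:i + CHARS_PER_CELL]
        # Assumption: character 0 of a chunk sits in the lowest 8 bits.
        cells.append(int.from_bytes(chunk, "little"))
    return cells


def load_char(cells, index):
    """Read character `index`: pick the cell, then shift and mask."""
    cell = cells[index // CHARS_PER_CELL]
    return (cell >> (8 * (index % CHARS_PER_CELL))) & 0xFF


text = "Hello, world"
packed = pack_string(text)
unpacked_cells = len(text) + 1  # one cell per character plus the NUL

print(len(packed), unpacked_cells)
print(chr(load_char(packed, 7)))
```

The packed form needs far fewer cells, but every character access costs a division, a shift and a mask, and a char* would have to carry a position inside the cell in addition to the cell address.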
Persistent data, which is kept in the blockchain, is more compact, but
explicit intrinsic calls are required to deserialize it; after that, the
programmer can manipulate it as 257-bit integers.
On Thu, Oct 31, 2019 at 1:48 PM David Chisnall <David.Chisnall at cl.cam.ac.uk>
wrote:
> On 31/10/2019 11:17, Dmitriy Borisenkov wrote:
> > David, just to clarify a misconception I might have introduced, we do
> > not have linear memory in the sense that all data is stored as a trie.
> > We do support arrays, structures and GEPs, however, as well as all
> > relevant features in C by modeling memory.
> So, if I understand correctly, your memory is a key-value store where
> the keys are 257-bit values and the values are arrays of 257-bit values?
> Or the values are 257-bit values? To help the discussion, please can
> you explain how the following are represented:
> - A pointer to an object.
> - A pointer to a field in an object.
> - An arbitrary void*.
> - The C string "hello world"
> > So regarding concepts of byte, all 5 statements you gave are true for
> > our target. Either due to the specification or because of
> > performance (gas consumption) issues. But if there are architectures
> > that need less from the notion of byte, we should try to figure out the
> > common denominator. It's probably ok to be less restrictive about a byte.
> It seems odd to encode a C string as an array of 257-bit values, rather
> than as an array of 8-bit values that are stored in 32-char chunks.