[cfe-dev] On sizeof char, bytes, and bits in the C99 standard
Kenneth Boyd
zaimoni at zaimoni.com
Fri Jan 2 09:00:18 PST 2009
Török Edwin wrote:
> Hi,
>
> I always considered sizeof(char) = one byte = 8 bits.
> However reading the C99 standard (N1256.pdf), and especially the C99
> rationale (C99RationalV5.10.pdf) I see that the intent is to allow
> for platforms where one byte != 8 bits.
>
> For example:
> "(Thus, for instance, on a machine with 36-bit words, a byte can be
> defined to consist of 9, 12, 18, or 36 bits, these numbers being all
> the exact divisors of 36 which are not less than 8.)"
>
These machines are not hypothetical, although of the historical
conventions the standard does require the Multics one (four 9-bit
logical chars packed into a 36-bit physical word).
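
A minimal sketch of the distinction, assuming nothing beyond <limits.h>
and <stddef.h>: sizeof counts bytes, sizeof(char) is 1 by definition, and
the number of bits in that byte is CHAR_BIT. The helper name storage_bits
is hypothetical, used here only for illustration.

```c
#include <limits.h>  /* CHAR_BIT: number of bits in a byte */
#include <stddef.h>  /* size_t */

/* Number of bits of storage occupied by an object of the given size.
   This counts padding bits too, so it can exceed the type's width.
   (storage_bits is a hypothetical helper, not a standard function.) */
size_t storage_bits(size_t size_in_bytes)
{
    return size_in_bytes * CHAR_BIT;
}
```

On a CHAR_BIT 9 machine, storage_bits(sizeof(char)) would be 9 while
sizeof(char) stays 1.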
> ....
> Section 3.6 defines byte: "NOTE 2 A byte is composed of a contiguous
> sequence of bits, the number of which is implementation-defined. The
> least significant bit is called the low-order bit; the most significant
> bit is called the high-order bit."
>
> Section 7.18.1.1 defines int8_t: "Thus, int8_t denotes a signed integer
> type with a width of exactly 8 bits."
>
Right -- when the typedef exists at all.
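
One way to test for the typedef, sketched under the assumption that the
implementation follows C99 7.18p4 (the limit macros of a type it does
not provide are not defined either). The names have_int8 and
int8_implies_8bit_bytes are hypothetical.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stdint.h>  /* INT8_MAX is defined only if int8_t is provided */

/* Returns 1 if this implementation provides int8_t, 0 otherwise. */
int have_int8(void)
{
#ifdef INT8_MAX
    return 1;
#else
    return 0;
#endif
}

#ifdef INT8_MAX
/* Compile-time check: an exact 8-bit type without padding bits can only
   exist when a byte is exactly 8 bits. The array size is -1, and the
   build fails, if that ever does not hold. */
typedef char int8_implies_8bit_bytes[(CHAR_BIT == 8) ? 1 : -1];
#endif
```

Code that must also build where int8_t is missing can use int_least8_t
instead, which C99 requires every implementation to provide.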
> This quote from C99Rationale V.5.10 "(Thus, for instance, on a machine
> with 36-bit words, a byte can be defined to consist of 9, 12, 18, or 36
> bits, these numbers being all the exact divisors of 36 which are not
> less than 8.)" shows that the intent was to allow for a definition of
> byte that doesn't necessarily have 8 bits.
>
> However, according to this quote: "These strictures codify the
> widespread presumption that any object can be treated as an array of
> characters, the size of which is given by the sizeof operator with that
> object’s type as its operand." I should be able to treat any object
> (thus including int8_t type objects) as an array of characters.
>
Yes, but int8_t is only guaranteed to exist on CHAR_BIT 8 machines that
use two's complement integers. Neither int8_t nor uint8_t is allowed to
exist on machines where CHAR_BIT!=8: these types may have no padding
bits, no object can be smaller than one char, and CHAR_BIT must be at
least 8 (C99 5.2.4.2.1p1), so an exactly-8-bit type cannot fit once
bytes are wider than 8 bits.
In particular, C99 7.18.1.1p3:
"These types are optional. However, if an implementation provides
integer types with widths of 8, 16, 32, or 64 bits, no padding bits,
and (for the signed types) that have a two’s complement representation,
it shall define the corresponding typedef names."
[uint8_t, uint16_t, uint32_t, uint64_t, int8_t, int16_t, int32_t, int64_t]
On a machine with CHAR_BIT 9, a conforming implementation can, but need
not, provide uint9_t (but I would expect it to as a quality of
implementation issue).
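
The array-of-characters view from the rationale quote can be sketched as
follows. This is only an illustration; object_bytes is a hypothetical
helper name. It relies on the rule that any object may be read through
an unsigned char pointer, sizeof(obj) bytes at a time, whatever CHAR_BIT
happens to be.

```c
#include <stddef.h>  /* size_t */

/* Copies the object representation of *obj (size bytes) into out,
   one byte at a time, and returns the number of bytes copied.
   out must have room for at least size bytes. */
size_t object_bytes(const void *obj, size_t size, unsigned char *out)
{
    const unsigned char *p = (const unsigned char *)obj;
    size_t i;

    for (i = 0; i < size; ++i)
        out[i] = p[i];
    return size;
}
```

Called on an int8_t object (where that type exists), this copies exactly
one byte, since sizeof(int8_t) is then 1.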
Kenneth Boyd