[cfe-dev] On sizeof char, bytes, and bits in the C99 standard

Kenneth Boyd zaimoni at zaimoni.com
Fri Jan 2 09:00:18 PST 2009


Török Edwin wrote:
> Hi,
>
> I always considered sizeof(char) = one byte = 8 bits.
> However reading the C99 standard (N1256.pdf), and especially the C99
> rationale (C99RationalV5.10.pdf) I see that the intent is to allow
> for platforms where one byte != 8 bits.
>
> For example:
> "(Thus, for instance, on a machine with 36-bit words, a byte can be
> defined to consist of 9, 12, 18, or 36 bits, these numbers being all the
> exact divisors of 36 which are not less than 8.)"
>   
These machines are not hypothetical, although of the historical 
conventions the standard effectively requires the Multics one (four 
9-bit logical chars packed into a 36-bit physical word).
> ....
> Section 3.6 defines byte: "NOTE 2 A byte is composed of a contiguous
> sequence of bits, the number of which is implementation-defined. The
> least significant bit is called the low-order bit; the most significant
> bit is called the high-order bit."
>
> Section 7.18.1.1 defines int8_t: "Thus, int8_t denotes a signed integer
> type with a width of exactly 8 bits."
>   
Right -- when the typedef exists at all.
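
A minimal sketch of how code can check whether the typedef exists, 
assuming only C99 7.18.2.1, which defines INT8_MAX exactly when int8_t 
itself is defined. The helper name has_exact_int8 is hypothetical, not 
anything from the standard or from clang:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if this implementation defines int8_t,
   0 otherwise. INT8_MAX is defined only when int8_t is, so the macro
   doubles as a feature test. */
static int has_exact_int8(void)
{
#ifdef INT8_MAX
    /* An exact-width type has no padding bits, so an 8-bit one must
       occupy exactly one 8-bit byte. */
    assert(sizeof(int8_t) * CHAR_BIT == 8);
    return 1;
#else
    return 0;
#endif
}
```

On an implementation with CHAR_BIT 9 the #else branch would be the one 
compiled, and int_least8_t (which C99 7.18.1.2 does require) is the 
usual fallback.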
> This quote from the C99 Rationale V5.10, "(Thus, for instance, on a
> machine with 36-bit words, a byte can be defined to consist of 9, 12,
> 18, or 36 bits, these numbers being all the exact divisors of 36 which
> are not less than 8.)", shows that the intent was to allow for a
> definition of byte that doesn't necessarily have 8 bits.
>
> However, according to this quote, "These strictures codify the
> widespread presumption that any object can be treated as an array of
> characters, the size of which is given by the sizeof operator with that
> object’s type as its operand.", I should be able to treat any object
> (thus including int8_t type objects) as an array of characters.
>   
Yes, but int8_t is only guaranteed to exist on CHAR_BIT 8 machines that 
use two's complement integers. Neither int8_t nor uint8_t is allowed to 
exist on machines where CHAR_BIT != 8: the exact-width types may have no 
padding bits, every object occupies at least one whole byte, and 
CHAR_BIT cannot be less than 8 (C99 5.2.4.2.1).
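
To illustrate the "array of characters" view from the Rationale, here 
is a small sketch. The helper names storage_bits and bytes_of_long are 
made up for this example. It shows that sizeof counts bytes of CHAR_BIT 
bits each, whatever CHAR_BIT happens to be, and that any object can be 
copied out as unsigned char:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bits of storage occupied by an object of `size`
   bytes. sizeof counts bytes, and a byte is CHAR_BIT bits wide. */
static size_t storage_bits(size_t size)
{
    return size * CHAR_BIT;
}

/* Hypothetical helper: copy the object representation of a long into
   buf, which must have room for sizeof(long) bytes. Returns the number
   of bytes written. */
static size_t bytes_of_long(long value, unsigned char *buf)
{
    assert(buf != NULL);
    memcpy(buf, &value, sizeof value);
    return sizeof value;
}
```

Note that sizeof(char) is 1 by definition, so storage_bits(sizeof(char)) 
is CHAR_BIT on every implementation, including a 9-bit-byte one.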

In particular, C99 7.18.1.1p3:
"These types are optional. However, if an implementation provides 
integer types with widths of 8, 16, 32, or 64 bits, no padding bits, 
and (for the signed types) that have a two’s complement representation, 
it shall define the corresponding typedef names."
[uint8_t, uint16_t, uint32_t, uint64_t, int8_t, int16_t, int32_t, 
int64_t]

On a machine with CHAR_BIT 9, a conforming implementation can, but need 
not, provide uint9_t (but I would expect it to as a quality of 
implementation issue).
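
Whether or not uint9_t is provided, uint_least8_t is required by C99 
7.18.1.2. The sketch below (helper names width_of_max and least8_width 
are hypothetical) measures how wide that type actually is by counting 
the bits in its maximum value. On a CHAR_BIT 9 machine I would expect 
this to report 9, but that is an assumption about such implementations 
rather than something the standard states:

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: number of value bits in a maximum value of the
   form 2^n - 1, found by shifting until nothing is left. */
static unsigned width_of_max(uintmax_t max)
{
    unsigned bits = 0;
    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* Hypothetical helper: width in bits of uint_least8_t, the smallest
   unsigned type with at least 8 bits. */
static unsigned least8_width(void)
{
    return width_of_max(UINT_LEAST8_MAX);
}
```

The same counting trick applied to UCHAR_MAX yields CHAR_BIT, since 
unsigned char has no padding bits.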

Kenneth Boyd



