[cfe-commits] [Review] Rolling out ASTContext::getTypeSizeInChars()

Chris Lattner clattner at apple.com
Tue Jan 12 10:17:41 PST 2010


On Jan 12, 2010, at 5:02 AM, Ken Dyck wrote:

> On Monday, January 11, 2010 4:50 PM, Ted Kremenek wrote:
>>
>> On Jan 11, 2010, at 1:24 PM, Ken Dyck wrote:
>>
>>>>
>>>> I'm also concerned about the dimensionality here.  Why did we
>>>> choose 'Chars' instead of 'Bytes'?
>>>
>>> The short answer is that it reflects how getTypeSizeInChars()
>>> calculates its value. It divides the bit size of the type by the bit
>>> size of the char type, so calling them CharUnits seemed more accurate
>>> than ByteUnits. The aim is to eventually support character widths
>>> other than 8.
>>>
>>> What specifically are you concerned about?
>>
>> I'm concerned that the uses of getTypeSize() / 8 always want
>> the size in bytes, not chars (if the size of chars differs
>> from the size of bytes).  Code that expects
>> getTypeSizeInChars() to return the size in bytes (which is
>> all the cases in libAnalysis) will get the wrong results.
>
> Just to get the terminology straight here, when we are talking about
> bytes do we mean:
>
>  A. an 8-bit value,
>  B. the smallest addressable unit of memory on a machine, or
>  C. an addressable unit of data storage large enough to hold any member
>     of the basic character set of the execution environment (C99), or
>  D. something else?
>
> However we define byte it seems that it is at least theoretically
> possible for the character type to have a different width, and so I
> think Ted makes a valid point. If there is code that expects a size in
> bytes (however defined), perhaps we need to add another API.

In clang, I prefer to avoid the term 'byte'.  CharUnits is great  
because it specifically says it is in units of char. :)

> As a clang newbie, I find it difficult to determine whether a literal 8
> means the width of a byte or that of a character, so I'm relying on you
> guys for reviews. So far, I have been approaching the problem with
> definition C above and the simplifying assumption that clang enforces
> byte width == char width, even if neither is 8. This allows characters
> and bytes to be used interchangeably.

To date, LLVM supports only targets with 8-bit bytes whose chars are also 8 bits.

-Chris 
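
For readers following the thread, the calculation Ken describes (bit size of
the type divided by bit size of char) can be sketched roughly as below. This
is only an illustration: CharCount and sizeInChars are hypothetical names
made up for this sketch, not clang's CharUnits class or the actual
ASTContext::getTypeSizeInChars() implementation, and it assumes every type
width is a whole multiple of the char width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a size counted in units of char.
// This is NOT clang's CharUnits class, only an illustration.
struct CharCount {
  int64_t Quantity;
};

// Divide a type's width in bits by the target's char width in bits.
// Assumption: the type width is a whole multiple of the char width.
inline CharCount sizeInChars(uint64_t TypeSizeInBits,
                             uint64_t CharWidthInBits) {
  assert(CharWidthInBits != 0 && "char width must be non-zero");
  assert(TypeSizeInBits % CharWidthInBits == 0 &&
         "type width must be a multiple of the char width");
  return CharCount{static_cast<int64_t>(TypeSizeInBits / CharWidthInBits)};
}
```

With an 8-bit char, sizeInChars(32, 8) gives 4, the same value as the
existing getTypeSize() / 8 idiom. With a hypothetical 16-bit char,
sizeInChars(32, 16) gives 2 while dividing by a literal 8 still gives 4,
which is the mismatch Ted is worried about for code that wants 8-bit bytes.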


