[cfe-dev] thoughts about n-bit bytes for clang/llvm
Ray Fix
rayfix.ml at gmail.com
Fri Sep 4 07:24:49 PDT 2009
Hello experts,
I am new to Clang, and I would like to support a system on chip where the
smallest addressable data type is 16 bits; in other words, sizeof(char)
== 1 byte == 16 bits. My understanding is that C/C++ only requires a
byte to be at least 8 bits and that sizeof(char) <= sizeof(short) <=
sizeof(int) <= sizeof(long) <= sizeof(long long).
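For concreteness, these are the invariants I am relying on. The sketch below
is a standalone check of them; the 16-bit figures in the comments are the
hypothetical values for this SoC, and only the relationships (not the exact
widths) are guaranteed by the standards:

    #include <cassert>
    #include <climits>

    int main() {
      assert(CHAR_BIT >= 8);          // only a lower bound is required
      assert(sizeof(char) == 1);      // always 1, even when CHAR_BIT == 16
      assert(sizeof(char)  <= sizeof(short));
      assert(sizeof(short) <= sizeof(int));
      assert(sizeof(int)   <= sizeof(long));
      assert(sizeof(long)  <= sizeof(long long));
      // On the SoC in question CHAR_BIT == 16, so sizeof(short) may also be 1.
      return 0;
    }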
In clang/Basic/TargetInfo.h:
unsigned getBoolWidth(bool isWide = false) const { return 8; } // FIXME
unsigned getBoolAlign(bool isWide = false) const { return 8; } // FIXME
unsigned getCharWidth() const { return 8; } // FIXME
unsigned getCharAlign() const { return 8; } // FIXME
:
unsigned getShortWidth() const { return 16; } // FIXME
unsigned getShortAlign() const { return 16; } // FIXME
These are easy enough to fix by making them configurable, the same way
IntWidth and IntAlign already are.
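The obvious shape of that fix, mirroring how IntWidth and IntAlign are
handled, would be something like the following (a sketch only: the
CharWidth/CharAlign members and the MySoC target are hypothetical, and the
real TargetInfo is of course not redefined like this):

    #include <string>

    // Sketch of a TargetInfo whose char width/alignment are configurable,
    // defaulting to 8 just as IntWidth/IntAlign default to 32.
    class TargetInfo {
    protected:
      unsigned CharWidth, CharAlign;   // hypothetical new members
      unsigned IntWidth, IntAlign;     // already exist today
      std::string Triple;
      TargetInfo(const std::string &T)
        : CharWidth(8), CharAlign(8), IntWidth(32), IntAlign(32), Triple(T) {}
    public:
      unsigned getCharWidth() const { return CharWidth; }
      unsigned getCharAlign() const { return CharAlign; }
      unsigned getIntWidth()  const { return IntWidth; }
      unsigned getIntAlign()  const { return IntAlign; }
    };

    // A 16-bit-byte target would then override the defaults in its
    // constructor, just as targets already do for IntWidth/IntAlign.
    class MySoCTargetInfo : public TargetInfo {
    public:
      MySoCTargetInfo(const std::string &T) : TargetInfo(T) {
        CharWidth = CharAlign = 16;
        IntWidth  = IntAlign  = 16;
      }
    };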
There are two consequences of this change that I am aware of.
The first is in preprocessor initialization. InitPreprocessor defines
__INT8_TYPE__, __INT16_TYPE__, __INT32_TYPE__, and sometimes
__INT64_TYPE__. It only defines __INT64_TYPE__ if the width of long long
is 64 bits, which seems odd to me:
// 16-bit targets doesn't necessarily have a 64-bit type.
if (TI.getLongLongWidth() == 64)
  DefineType("__INT64_TYPE__", TI.getInt64Type(), Buf);
In my case, __INT8_TYPE__ and __INT64_TYPE__ don't exist, so it doesn't
really make sense to define them.
I think a better way of generating these definitions would be the
following (pseudo-code; it doesn't actually compile):
// Define types for char, short, int, long, long long; emit an
// __INT<N>_TYPE__ macro only once per distinct width.
DefineType("__INT" + TI.getCharWidth() + "_TYPE__", TI.getCharWidth());
if (TI.getShortWidth() > TI.getCharWidth())
  DefineType("__INT" + TI.getShortWidth() + "_TYPE__", TI.getShortWidth());
if (TI.getIntWidth() > TI.getShortWidth())
  DefineType("__INT" + TI.getIntWidth() + "_TYPE__", TI.getIntWidth());
if (TI.getLongWidth() > TI.getIntWidth())
  DefineType("__INT" + TI.getLongWidth() + "_TYPE__", TI.getLongWidth());
if (TI.getLongLongWidth() > TI.getLongWidth())
  DefineType("__INT" + TI.getLongLongWidth() + "_TYPE__", TI.getLongLongWidth());
This would result in the creation of __INT8_TYPE__, __INT16_TYPE__,
__INT32_TYPE__, and __INT64_TYPE__ on most platforms. For my platform it
would only create __INT16_TYPE__ and __INT32_TYPE__. It would also work
for wacky 9-bit machines, where INT8s don't make much sense, and for
architectures where long long is 128 bits.
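Once the widths are configurable, the resulting set of predefines is easy to
check against a build of clang, assuming the GCC-compatible macro-dump flags
are available in the driver:

    clang -E -dM -x c /dev/null | grep __INT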
The other place I am aware of (thanks to a useful assertion) that makes
a difference is the char literal parser in Lex/LiteralSupport.cpp. I am
still wrapping my head around this, but I think fixing it for arbitrary
byte sizes is doable. (As a new person, I need to figure out a good way
to test it.)
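To make the testing question concrete, these are the kinds of literals whose
handling depends on the char width (a sketch; the 16-bit behaviour described
in the comments is what I would expect on this SoC, not something the current
parser implements):

    // With a 16-bit char this hex escape fits in a single char;
    // with an 8-bit char it is out of range and should be diagnosed.
    char c = '\x1234';

    // Multi-character constants are packed one char at a time, so the
    // shift inside the literal parser has to be the char width rather
    // than a hard-coded 8.
    int packed = 'ab';

    // Wide literals use the wchar_t width and should be unaffected.
    wchar_t w = L'a';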
Do these changes seem reasonable to pursue? What other things are
broken in Clang and LLVM by changing the assumption about 8-bit bytes?
Your advice is appreciated,
Ray