[LLVMdev] LLVM supports Unicode?

Erik de Castro Lopo mle+cl at mega-nerd.com
Sun Aug 28 15:41:11 PDT 2011


geovanisouza92 at gmail.com wrote:

> I'm trying to create a new programming language, and I want it to have
> Unicode support (that is, to read and manipulate source code and
> string literals correctly).

LLVM IR itself only supports one string type, which is an array of
i8 (8-bit integers). In your compiler you can use UTF-8, and any
UTF-8 string literal can be stored in an i8 array in the LLVM IR.

For example, the LLVM backend for the DDC compiler [0] does this:

   @str = internal constant [4 x i8] c"bar\00", align 8
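
As a rough illustration of how a front end might emit such a constant
for a non-ASCII literal, here is a minimal Python sketch. The function
name llvm_string_constant is hypothetical (it is not part of LLVM or
DDC); it assumes the literal is NUL-terminated like the example above
and leaves out the alignment attribute.

```python
def llvm_string_constant(name: str, text: str) -> str:
    """Return an LLVM IR global holding `text` as NUL-terminated UTF-8.

    Hypothetical helper for illustration only; not an LLVM API.
    """
    # Encode to UTF-8 and append the terminating NUL byte.
    data = text.encode("utf-8") + b"\x00"

    # Printable ASCII is kept as-is, except '"' (0x22) and '\' (0x5C).
    # Every other byte is written as a two-digit hex escape (\XX).
    body = "".join(
        chr(b) if 0x20 <= b < 0x7F and b not in (0x22, 0x5C) else f"\\{b:02X}"
        for b in data
    )

    # The array length is the number of bytes, not the number of characters.
    return f'@{name} = internal constant [{len(data)} x i8] c"{body}"'


if __name__ == "__main__":
    print(llvm_string_constant("str", "bar"))
    print(llvm_string_constant("str2", "\u00e9"))
```

Note that the array length counts UTF-8 bytes plus the terminator, so
the one-character literal U+00E9 needs a [3 x i8] array (C3 A9 00).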


HTH,
Erik

[0] http://disciple.ouroborus.net/
-- 
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/


