[LLVMbugs] [Bug 23895] New: cannot initialize arrays of type char16_t

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Fri Jun 19 09:10:46 PDT 2015


https://llvm.org/bugs/show_bug.cgi?id=23895

            Bug ID: 23895
           Summary: cannot initialize arrays of type char16_t
           Product: clang
           Version: unspecified
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: normal
          Priority: P
         Component: -New Bugs
          Assignee: unassignedclangbugs at nondot.org
          Reporter: javier_3 at runbox.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Consider the following code:

static const wchar_t a[]=u"foo";
static const char16_t b[]=u"foo";
static const char16_t c[4]=u"foo";
static const char16_t d[4]={'f','o','o',0};
static const char16_t * const e=u"foo";

void foo(){
    e=d;
}

Every declaration except that of d fails. The error messages are:

error: initializing wide char array with incompatible wide string literal
static const wchar_t a[]=u"foo";
                     ^
error: array initializer must be an initializer list
static const char16_t b[]=u"foo";
                      ^
error: array initializer must be an initializer list
static const char16_t c[4]=u"foo";
                      ^
error: cannot initialize a variable of type 'const char16_t *const' (aka 'const unsigned short *const') with an lvalue of type 'const char16_t [4]'
static const char16_t * const e=u"foo";


The last one is especially curious, given that e=d; is accepted and d is also a
const char16_t [4].

The first one is explicitly allowed by the C11 standard. In 6.7.9 (par. 15) we
read:

An array with element type compatible with a qualified or unqualified version
of wchar_t may be initialized by a wide string literal,

and in 6.4.5 (par. 3),

[...] A wide string literal is the same, except prefixed by the letter L, u, or U.

So u"foo" is a wide string literal which, by the explicit wording of 6.7.9
(Initialization), may be used to initialize an array of wchar_t elements.

The last one should be allowed, just as e=d is, because of what 6.4.5 (par. 6)
says about a string literal such as u"foo":

The multibyte character sequence is then used to initialize an array of static
storage duration and length just sufficient to contain the sequence.

So it is a static array of type char16_t [4], which should be valid as the right
operand of = where the left-hand side is of type const char16_t *.
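
The expectation described above can be written out as a small C11 sketch. It
assumes a hosted implementation that provides <uchar.h> and a compiler that
accepts the initializations this report argues for; the helper names u16_len,
u16_equal and declarations_agree are hypothetical and exist only for
illustration.

```c
#include <stddef.h>
#include <uchar.h>   /* C11 header that declares char16_t */

/* The declarations from the report that involve char16_t. */
static const char16_t b[] = u"foo";
static const char16_t c[4] = u"foo";
static const char16_t d[4] = { 'f', 'o', 'o', 0 };
static const char16_t *const e = u"foo";

/* The literal itself is an array of four char16_t:
   three letters plus the terminating zero. */
_Static_assert(sizeof u"foo" == 4 * sizeof(char16_t),
               "u\"foo\" has four elements");
_Static_assert(sizeof b == sizeof d, "b and d have the same size");

/* Hypothetical helper: number of elements before the terminating zero. */
size_t u16_len(const char16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Hypothetical helper: 1 if both strings hold the same elements, else 0. */
int u16_equal(const char16_t *x, const char16_t *y)
{
    while (*x != 0 && *x == *y) {
        x++;
        y++;
    }
    return *x == *y;
}

/* Returns 1 when b, c, d and the string e points to all hold "foo". */
int declarations_agree(void)
{
    return u16_equal(b, d) && u16_equal(c, d) && u16_equal(e, d);
}
```

If a compiler accepts this file, calling declarations_agree() is a quick way to
confirm that the array and pointer forms end up with the same contents.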

This bug is very annoying because it precludes using u-prefixed literals for the
program's constant strings, and hence char16_t throughout the program (if the
constants must be of type wchar_t, then so must the functions taking them, and
so on). In particular, I would have to undo all the changes like

static const char16_t* const ff[]={
    u"All",
    u"FormaP",
    u"Color",
    u"ColorP"
...
}

where I replaced wchar_t with char16_t and L" with u". I have more than a
thousand of these, and I would also have to revert all the functions that
operated on wchar_t strings, which I have changed to take or return char16_t
strings.
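
As a rough illustration of the kind of code affected, here is a sketch of such a
table together with one function that takes a char16_t string. It uses only the
four entries visible above (the elided ones are left out), assumes <uchar.h> is
available, and the names FF_COUNT, same_u16 and ff_index are hypothetical, not
taken from the actual program.

```c
#include <stddef.h>
#include <uchar.h>   /* C11 header that declares char16_t */

/* The four entries visible in the report; the elided ones are omitted. */
static const char16_t *const ff[] = {
    u"All",
    u"FormaP",
    u"Color",
    u"ColorP"
};

/* Number of entries in the table. */
enum { FF_COUNT = sizeof ff / sizeof ff[0] };

/* Hypothetical helper: 1 if both strings hold the same elements, else 0. */
static int same_u16(const char16_t *x, const char16_t *y)
{
    while (*x != 0 && *x == *y) {
        x++;
        y++;
    }
    return *x == *y;
}

/* Hypothetical lookup: index of name in ff, or -1 if it is not present. */
int ff_index(const char16_t *name)
{
    for (int i = 0; i < FF_COUNT; i++) {
        if (same_u16(ff[i], name))
            return i;
    }
    return -1;
}
```

A call such as ff_index(u"Color") passes a u-prefixed literal directly, which is
the usage that the rejected initializations make impossible.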
