[cfe-dev] Wide strings and clang::StringLiteral.

Chris Lattner clattner at apple.com
Fri Dec 5 08:10:17 PST 2008


On Dec 5, 2008, at 2:01 AM, Cédric Venet wrote:

>
>>> set && string contains only
>>> characters in the range 0-0x7f" and having a slow path for  
>>> everything else.
>>>
>>
>> Ah, right, you want to store the strings in UTF-8.  That seems fine; I
>> expect non-ASCII in strings is very rare
>
> For French programs, and probably for other non-English languages,
> non-ASCII in strings is *not* very rare. Accented characters are not
> ASCII, and most French sentences contain at least one accented
> character.
>
>
> I just wanted to point this out, even if it is not very important.
> (I am French but I write my programs in English, and I suppose more
> and more people use internationalisation software, so the problem
> may be small).

Sure, I'm not arguing for lack of functionality, just saying that we
shouldn't worry about optimizing for that case yet.  The percentage of
tokens that are string literals is also very low, and the percentage
that would use high characters is even lower, even in French-speaking
countries (I suspect).
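
For illustration only, here is a minimal sketch of the "fast path for
strings in the range 0-0x7f, slow path for everything else" idea
discussed above. It is not code from clang: the helper names
isPureASCII and countCodePoints are hypothetical, and the slow path
assumes the input is well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true when every byte is in the range 0-0x7f.
bool isPureASCII(const std::string &Bytes) {
  for (unsigned char C : Bytes)
    if (C > 0x7f)
      return false;
  return true;
}

// Hypothetical helper: number of code points in a UTF-8 string.
// Assumes well-formed UTF-8; no validation is performed.
std::size_t countCodePoints(const std::string &UTF8) {
  // Fast path: in pure ASCII, one byte is exactly one character.
  if (isPureASCII(UTF8))
    return UTF8.size();

  // Slow path: count every byte that is not a UTF-8 continuation
  // byte (continuation bytes have the bit pattern 10xxxxxx).
  std::size_t N = 0;
  for (unsigned char C : UTF8)
    if ((C & 0xC0) != 0x80)
      ++N;
  return N;
}
```

The slow path costs one extra pass over the bytes, which matters
little if non-ASCII string literals are as rare as suspected.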

-Chris


