[cfe-dev] Clang's string type?

Friedman, Eli via cfe-dev cfe-dev at lists.llvm.org
Thu Aug 23 16:06:36 PDT 2018


On 8/23/2018 3:27 PM, Marcus Johnson via cfe-dev wrote:
>
>
>     Thanks for the link to that thread Tim.
>
>
> Eli: "I don't follow; can't you just convert the format string from 
> UTF-16/UTF-32 to UTF-8 before checking it?  (Granted, that's not 
> particularly efficient, but it's rare enough that it probably doesn't 
> matter.)"
>
>     and I realized a bit after posting this that converting the format
>     strings from UTF-16/wchar, to UTF-8 would probably be the best way
>     to achieve this Eli.
>
>
> I'm just not sure how I'd handle the type matching, do you know when 
> that happens in comparison to when the string/character literals would 
> be converted? would that get in the way, or get messed up?

In the clang AST, a string literal is represented as an array of 
integers of the appropriate width; the lexer converts from UTF-8 to 
UTF-16 or UTF-32 at the same time it resolves escapes.  (This is 
necessary to compute the length of the string, which is part of the 
string literal's type.)
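
To make the point about length concrete, here is a small standalone sketch (plain C++, not clang code; the helper name codeUnitCount is made up for illustration). It reads the code unit count straight out of each literal's array type, and shows that an escape is already resolved to one or more code units of the literal's encoding by then:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the number of code units in a string literal,
// including the terminating null, taken from the literal's array type.
template <typename CharT, std::size_t N>
constexpr std::size_t codeUnitCount(const CharT (&)[N]) {
  return N;
}

// U+00E9 is a single UTF-16 code unit: 5 characters + null = 6.
static_assert(codeUnitCount(u"h\u00e9llo") == 6, "UTF-16 length");

// U+1F600 lies outside the BMP: a surrogate pair in UTF-16 (2 + null),
// but a single code unit in UTF-32 (1 + null).
static_assert(codeUnitCount(u"\U0001F600") == 3, "UTF-16 surrogate pair");
static_assert(codeUnitCount(U"\U0001F600") == 2, "UTF-32 single unit");

// The same U+00E9 needs two code units in UTF-8 (2 + null).
static_assert(codeUnitCount(u8"\u00e9") == 3, "UTF-8 length");
```

Since the array bound depends on the encoding, the conversion has to happen before the literal can be given a type.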

You can check the width of the characters in a string using 
StringLiteral::getCharByteWidth().  It's 1, 2 or 4 bytes per code 
unit: 1 for ordinary and UTF-8 strings, 2 for UTF-16, 4 for UTF-32 
(wide strings use whatever width wchar_t has on the target).  You can 
read individual code units from that array using 
StringLiteral::getCodeUnit().  Or you can grab the whole array using 
StringLiteral::getBytes() (note that the return type here is a bit 
misleading: it's a StringRef over the raw bytes, even when each code 
unit is wider than one byte).
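
As a rough illustration of why a byte-oriented view of a wide string needs care, the following standalone sketch (plain C++, not the clang implementation) reads code units back out of a raw byte buffer given the per-unit width. The names codeUnitAt and rawBytes are hypothetical, and the sketch assumes the buffer holds code units in host byte order:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-in: read the i-th code unit from a raw byte buffer
// whose code units are `width` bytes wide (1, 2 or 4), in host byte order.
std::uint32_t codeUnitAt(const std::string &bytes, unsigned width,
                         std::size_t i) {
  assert(width == 1 || width == 2 || width == 4);
  assert((i + 1) * width <= bytes.size());
  const char *p = bytes.data() + i * width;
  if (width == 1)
    return static_cast<unsigned char>(*p);
  if (width == 2) {
    std::uint16_t v;
    std::memcpy(&v, p, sizeof v); // memcpy avoids unaligned access
    return v;
  }
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Hypothetical helper: view a UTF-16 string as its raw bytes, which is
// roughly the shape of data a byte-oriented accessor would hand back.
std::string rawBytes(const std::u16string &s) {
  return std::string(reinterpret_cast<const char *>(s.data()),
                     s.size() * sizeof(char16_t));
}
```

For example, rawBytes(u"a\u00e9") is four bytes long, and codeUnitAt on it with a width of 2 yields 'a' at index 0 and 0xE9 at index 1. Indexing the same buffer as if it were a char string would give two entries per character instead.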

Actually, you might not want to use a real UTF-16 to UTF-8 conversion; 
it may be better to translate every non-ASCII code unit to a single 
placeholder byte such as 0xFF.  Not that it really affects the 
parsing, but it keeps one byte per code unit, which probably makes 
translating back to a source location along the lines of 
StringLiteral::getLocationOfByte easier.
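
A minimal sketch of that flattening, as standalone C++ rather than clang code: the function name flattenForFormatCheck is hypothetical, and it assumes the code units have already been read out of the literal into a std::u16string.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: map each UTF-16 code unit to exactly one byte.
// ASCII units are kept as-is; everything else becomes the placeholder
// byte 0xFF, so byte offset i in the result is code unit i in the input.
std::string flattenForFormatCheck(const std::u16string &units) {
  std::string out;
  out.reserve(units.size());
  for (char16_t u : units) {
    if (u < 0x80)
      out.push_back(static_cast<char>(u));
    else
      out.push_back(static_cast<char>(0xFF)); // placeholder, not real UTF-8
  }
  return out;
}
```

With the input u"%d \u00e9 %s", the result has seven bytes, the byte at index 3 is 0xFF, and the "%s" specifier starts at offset 5, the same index it has in the original code unit array. A true UTF-8 conversion would have shifted that offset by one, since U+00E9 encodes as two bytes.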

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
