[cfe-dev] Clang's string type?
Friedman, Eli via cfe-dev
cfe-dev at lists.llvm.org
Thu Aug 23 16:06:36 PDT 2018
On 8/23/2018 3:27 PM, Marcus Johnson via cfe-dev wrote:
>
>
> Thanks for the link to that thread, Tim.
>
>
> Eli: "I don't follow; can't you just convert the format string from
> UTF-16/UTF-32 to UTF-8 before checking it? (Granted, that's not
> particularly efficient, but it's rare enough that it probably doesn't
> matter.)"
>
> And I realized a bit after posting this that converting the format
> strings from UTF-16/wchar to UTF-8 would probably be the best way
> to achieve this, Eli.
>
>
> I'm just not sure how I'd handle the type matching. Do you know when
> that happens relative to when the string/character literals are
> converted? Would that get in the way, or get messed up?

In the Clang AST, a string literal is represented as an array of
integers of the appropriate width; the lexer converts from UTF-8 to
UTF-16 or UTF-32 at the same time it resolves escapes. (This is
necessary to compute the length of the string, which is part of the
string literal's type.)
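
To illustrate why the conversion has to happen before the type is known,
here is a small standalone sketch in plain C++ (no Clang headers). The
helper name codeUnits is made up for this example; it simply reports the
array bound of a literal, which counts code units plus the terminating
null. The same character yields different array lengths depending on
the encoding prefix.

```cpp
#include <cstddef>

// Hypothetical helper (not a Clang API): returns the array bound of a
// string literal, i.e. its number of code units including the null.
template <typename CharT, std::size_t N>
constexpr std::size_t codeUnits(const CharT (&)[N]) {
  return N;
}

// U+00E9 needs 2 code units in UTF-8 but only 1 in UTF-16 and UTF-32.
static_assert(codeUnits(u8"\u00e9") == 3, "2 UTF-8 bytes + null");
static_assert(codeUnits(u"\u00e9") == 2, "1 UTF-16 unit + null");
static_assert(codeUnits(U"\u00e9") == 2, "1 UTF-32 unit + null");

// U+1F600 needs 4 UTF-8 bytes, a UTF-16 surrogate pair, or 1 UTF-32 unit.
static_assert(codeUnits(u8"\U0001F600") == 5, "4 UTF-8 bytes + null");
static_assert(codeUnits(u"\U0001F600") == 3, "surrogate pair + null");
static_assert(codeUnits(U"\U0001F600") == 2, "1 UTF-32 unit + null");
```

Since the array bound is part of the literal's type, the compiler cannot
assign a type until it knows how many code units the converted string
occupies.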

You can check the width of the characters in a string using
StringLiteral::getCharByteWidth(). It returns 1, 2, or 4, depending on
whether the literal is UTF-8, UTF-16, or UTF-32. You can read individual
characters from that array using StringLiteral::getCodeUnit(). Or you
can grab the whole array using StringLiteral::getBytes() (note that the
return type here is a bit misleading: it is a StringRef of raw bytes,
even when each code unit is 2 or 4 bytes wide).
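
As a rough model of how the byte width and the raw byte array relate,
the sketch below reads one code unit out of a raw byte buffer. It is
plain C++ and does not use Clang; readCodeUnit and rawBytes are
hypothetical helpers written for this example, and the sketch assumes
the code units are stored in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper (not a Clang API): read code unit `index` from a
// raw byte buffer whose elements are `width` bytes wide, host byte order.
std::uint32_t readCodeUnit(const std::string &bytes, unsigned width,
                           std::size_t index) {
  assert(width == 1 || width == 2 || width == 4);
  assert((index + 1) * width <= bytes.size());
  const char *p = bytes.data() + index * width;
  if (width == 1)
    return static_cast<unsigned char>(*p);
  if (width == 2) {
    std::uint16_t v;
    std::memcpy(&v, p, sizeof v); // memcpy avoids unaligned access
    return v;
  }
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Hypothetical helper: view a UTF-16 string as its underlying raw bytes.
std::string rawBytes(const std::u16string &s) {
  return std::string(reinterpret_cast<const char *>(s.data()),
                     s.size() * sizeof(char16_t));
}
```

For example, rawBytes(u"a\u00e9") is 4 bytes long, and reading it with a
width of 2 gives 0x61 at index 0 and 0xE9 at index 1.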

Actually, you might not want to use a real UTF-16 to UTF-8 conversion;
it may be better to translate every non-ASCII code unit to 0xFF or
something similar. Not that it really affects the parsing, but it
probably makes translating back to a source location, along the lines of
StringLiteral::getLocationOfByte(), easier, because each code unit then
maps to exactly one byte of the converted string.
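
A minimal sketch of that placeholder idea, in plain C++ without Clang
headers: flattenForFormatCheck is a hypothetical name, and the input is
assumed to be the literal's code units already collected into a
std::basic_string. ASCII units pass through unchanged and everything
else becomes 0xFF, so index i in the result always corresponds to code
unit i of the original.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Clang API): produce a one-byte-per-code-unit
// string suitable for feeding to a byte-oriented format string checker.
template <typename CharT>
std::string flattenForFormatCheck(const std::basic_string<CharT> &units) {
  std::string out;
  out.reserve(units.size());
  for (CharT u : units) {
    // Widen first so that negative plain-char values also count as non-ASCII.
    const std::uint32_t v = static_cast<std::uint32_t>(u);
    out.push_back(v < 0x80 ? static_cast<char>(v) : static_cast<char>(0xFF));
  }
  // The mapping is one-to-one, so offsets are preserved.
  assert(out.size() == units.size());
  return out;
}
```

With std::u16string(u"%d \u00e9 %s") as input, the result is 7 bytes
long, the byte at index 3 is 0xFF, and "%s" is still found at index 5,
the same offset it has in the original code units.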
-Eli
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project