[PATCH] D124996: [clang][preprocessor] Fix unsigned-ness of utf8 char literals
Tom Honermann via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu May 5 07:33:55 PDT 2022
tahonermann requested changes to this revision.
tahonermann added a comment.
This revision now requires changes to proceed.
I think changes are needed to make this behavior dependent on whether `char8_t` support is active or not.
================
Comment at: clang/lib/Lex/PPExpressions.cpp:413
Val.setIsUnsigned(!TargetInfo::isTypeSigned(TI.getWCharType()));
- else if (!Literal.isUTF16() && !Literal.isUTF32())
+ else if (!Literal.isUTF8() && !Literal.isUTF16() && !Literal.isUTF32())
Val.setIsUnsigned(!PP.getLangOpts().CharIsSigned);
----------------
I think the check for UTF-8 should also be conditioned on `PP.getLangOpts().Char8`. When `char8_t` support is not enabled (as in C++17 or with `-fno-char8_t` in C++20), UTF-8 character literals still have type `char`.
else if (!(Literal.isUTF8() && PP.getLangOpts().Char8) && !Literal.isUTF16() && !Literal.isUTF32())
================
Comment at: clang/test/Lexer/utf8-char-literal.cpp:4
// RUN: %clang_cc1 -triple x86_64-apple-darwin -std=c2x -x c -fsyntax-only -verify %s
// RUN: %clang_cc1 -triple x86_64-apple-darwin -std=c++1z -fsyntax-only -verify %s
----------------
I think we should drop testing for `-std=c++1z` and add testing of `-std=c++17` and `-std=c++20`. Ideally, the test would then validate the differences in behavior.
================
Comment at: clang/test/Lexer/utf8-char-literal.cpp:31-34
+#if __cplusplus > 201402L || __STDC_VERSION__ >= 202000L
+#if u8'\xff' != 0xff
+#error u8 char literal is not unsigned
+#endif
----------------
Prior to C++20 (unless `-fchar8_t` is passed), `u8'\xff'` should have the same behavior as `'\xff'`.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124996/new/
https://reviews.llvm.org/D124996
More information about the cfe-commits
mailing list