[PATCH] D124996: [clang][preprocessor] Fix unsigned-ness of utf8 char literals

Tom Honermann via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu May 5 07:33:55 PDT 2022


tahonermann requested changes to this revision.
tahonermann added a comment.
This revision now requires changes to proceed.

I think changes are needed to make this behavior dependent on whether `char8_t` support is active or not.



================
Comment at: clang/lib/Lex/PPExpressions.cpp:413
       Val.setIsUnsigned(!TargetInfo::isTypeSigned(TI.getWCharType()));
-    else if (!Literal.isUTF16() && !Literal.isUTF32())
+    else if (!Literal.isUTF8() && !Literal.isUTF16() && !Literal.isUTF32())
       Val.setIsUnsigned(!PP.getLangOpts().CharIsSigned);
----------------
I think the check for UTF-8 should also be conditioned on `PP.getLangOpts().Char8`. When `char8_t` support is not enabled (as in C++17 or with `-fno-char8_t` in C++20), UTF-8 character literals still have type `char`.
      else if (!(Literal.isUTF8() && PP.getLangOpts().Char8) && !Literal.isUTF16() && !Literal.isUTF32())


================
Comment at: clang/test/Lexer/utf8-char-literal.cpp:4
 // RUN: %clang_cc1 -triple x86_64-apple-darwin -std=c2x -x c -fsyntax-only -verify %s
 // RUN: %clang_cc1 -triple x86_64-apple-darwin -std=c++1z -fsyntax-only -verify %s
 
----------------
I think we should drop testing for `-std=c++1z` and add testing of `-std=c++17` and `-std=c++20`. Ideally, the test would then validate the differences in behavior.


================
Comment at: clang/test/Lexer/utf8-char-literal.cpp:31-34
+#if __cplusplus > 201402L || __STDC_VERSION__ >= 202000L
+#if u8'\xff' != 0xff
+#error u8 char literal is not unsigned
+#endif
----------------
Prior to C++20 (unless `-fchar8_t` is passed), `u8'\xff'` should have the same behavior as `'\xff'`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124996/new/

https://reviews.llvm.org/D124996



More information about the cfe-commits mailing list