[llvm-branch-commits] [clang] [llvm] Enable fexec-charset option (PR #138895)

via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Mon May 12 00:42:43 PDT 2025


================
@@ -1842,23 +1859,52 @@ CharLiteralParser::CharLiteralParser(const char *begin, const char *end,
             HadError = true;
             PP.Diag(Loc, diag::err_character_too_large);
           }
+          if (!HadError && Converter) {
+            assert(Kind != tok::wide_char_constant &&
+                   "Wide character translation not supported");
+            char ByteChar = *tmp_out_start;
+            SmallString<1> ConvertedChar;
+            Converter->convert(StringRef(&ByteChar, 1), ConvertedChar);
----------------
cor3ntin wrote:

Here the order of operations should be:
  -> convert from UTF-8 to UTF-32 and check that it is a valid character
  -> convert the same buffer from UTF-8 to the literal encoding
  -> check that the conversion succeeded and that the result has a size of one

(i.e. some code points might have a size of 2 when encoded as UTF-8 but 1 when encoded as Latin-1)
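
The steps above can be sketched as standalone C++. This is only an illustration of the ordering, not the code in the PR: the helper names (`decodeSingleUTF8`, `convertUTF8ToLatin1`, `translateCharLiteral`) are hypothetical, and Latin-1 is assumed as the literal encoding purely because it is easy to hand-code; the real patch would go through its `Converter` object instead.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Step 1 (hypothetical helper): decode exactly one UTF-8 sequence to UTF-32.
// Returns nullopt for malformed, overlong, surrogate, or out-of-range input,
// and also when the buffer does not hold exactly one code point.
std::optional<char32_t> decodeSingleUTF8(std::string_view In) {
  if (In.empty()) return std::nullopt;
  unsigned char B0 = static_cast<unsigned char>(In[0]);
  std::size_t Len;
  char32_t CP, Min;
  if (B0 < 0x80)                { Len = 1; CP = B0;        Min = 0; }
  else if ((B0 & 0xE0) == 0xC0) { Len = 2; CP = B0 & 0x1F; Min = 0x80; }
  else if ((B0 & 0xF0) == 0xE0) { Len = 3; CP = B0 & 0x0F; Min = 0x800; }
  else if ((B0 & 0xF8) == 0xF0) { Len = 4; CP = B0 & 0x07; Min = 0x10000; }
  else return std::nullopt;
  if (In.size() != Len) return std::nullopt;
  for (std::size_t I = 1; I < Len; ++I) {
    unsigned char B = static_cast<unsigned char>(In[I]);
    if ((B & 0xC0) != 0x80) return std::nullopt; // not a continuation byte
    CP = (CP << 6) | (B & 0x3F);
  }
  if (CP < Min || CP > 0x10FFFF || (CP >= 0xD800 && CP <= 0xDFFF))
    return std::nullopt;
  return CP;
}

// Step 2 (hypothetical stand-in for the converter): UTF-8 -> Latin-1.
// Only U+0000..U+00FF are representable, so only lead bytes below 0x80 or
// equal to 0xC2/0xC3 are accepted. Returns false on anything else.
bool convertUTF8ToLatin1(std::string_view In, std::string &Out) {
  for (std::size_t I = 0; I < In.size();) {
    unsigned char B0 = static_cast<unsigned char>(In[I]);
    if (B0 < 0x80) { Out.push_back(static_cast<char>(B0)); ++I; continue; }
    if ((B0 != 0xC2 && B0 != 0xC3) || I + 1 >= In.size()) return false;
    unsigned char B1 = static_cast<unsigned char>(In[I + 1]);
    if ((B1 & 0xC0) != 0x80) return false;
    Out.push_back(static_cast<char>(((B0 & 0x1F) << 6) | (B1 & 0x3F)));
    I += 2;
  }
  return true;
}

// Step 3: validate first, convert the same buffer, then require that the
// conversion succeeded and produced exactly one byte.
std::optional<unsigned char> translateCharLiteral(std::string_view UTF8) {
  if (!decodeSingleUTF8(UTF8)) return std::nullopt;
  std::string Converted;
  if (!convertUTF8ToLatin1(UTF8, Converted) || Converted.size() != 1)
    return std::nullopt;
  return static_cast<unsigned char>(Converted[0]);
}
```

With these assumptions, `translateCharLiteral("\xC3\xA9")` (U+00E9, two bytes in UTF-8) yields the single byte 0xE9, while the three-byte sequence for U+20AC is rejected because Latin-1 cannot represent it. Converting only the first byte of the UTF-8 sequence, as the quoted hunk does, would not handle the first case.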

https://github.com/llvm/llvm-project/pull/138895

