[PATCH] D77598: Integral template argument suffix and cast printing

Richard Smith - zygoloid via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Feb 19 15:33:39 PST 2021


rsmith added inline comments.


================
Comment at: clang/lib/AST/TemplateBase.cpp:90
+  } else if (T->isWideCharType())
+    Out << "L'" << reinterpret_cast<wchar_t*>(Val.getLimitedValue(WCHAR_MAX))
+        << "'";
----------------
`getLimitedValue` doesn't return a pointer, and the host `WCHAR_MAX` has no bearing on how target literals should be printed. Please take a look at `StmtPrinter::VisitCharacterLiteral` and see if you can factor out a helper function that can be shared between that and this. (Maybe as a static member of `CharacterLiteral`?)


================
Comment at: clang/lib/AST/TemplateBase.cpp:98
+  } else if (T->isChar16Type())
+    Out << "u'" << Val << "'";
+  else if (T->isChar32Type())
----------------
This is still printing the numeric value of the character rather than the character itself.


================
Comment at: clang/lib/AST/TemplateBase.cpp:101
+    Out << "U'"
+        << reinterpret_cast<char32_t*>(Val.getLimitedValue(UINT_LEAST32_MAX))
+        << "'";
----------------
Like in the above case, this cast doesn't make sense.


================
Comment at: clang/test/CXX/lex/lex.literal/lex.ext/p12.cpp:21
 int f = UR"("ั‚ะตัั‚ ๐€€)"_x;
-int g = UR"("ั‚ะตัั‚_๐€€)"_x; // expected-note {{in instantiation of function template specialization 'operator""_x<char32_t, 34, 1090, 1077, 1089, 1090, 95, 65536>' requested here}}
+int g = UR"("ั‚ะตัั‚_๐€€)"_x; // expected-note {{in instantiation of function template specialization 'operator""_x<char32_t, (char32_t)34, (char32_t)1090, (char32_t)1077, (char32_t)1089, (char32_t)1090, (char32_t)95, (char32_t)65536>' requested here}}
----------------
rsmith wrote:
> rsmith wrote:
> > It would be useful to generate `u8'...'`, `u16'...'`, and `u32'...'` literals at least whenever we think they'd contain printable characters.
> Note the sample output here is wrong. We should ideally print
> 
> ```
> operator""_x<char32_t, U'\"', U'ั‚', U'ะต', U'ั', U'ั‚', U'_', U'๐€€">
> ```
> 
> (with or without the `\` before the `"`), but if we don't have a good way to figure out which characters are printable, printing
> 
> ```
> operator""_x<char32_t, U'\"', U'\u0442', U'\u0435', U'\u0441', U'\u0442', U'_', U'\U00010000">
> ```
> 
> would be an acceptable fallback. You should check if LLVM already has some way to determine if a given Unicode character is printable and use it if available. I think the diagnostics infrastructure may use something like this already.
This output is still wrong.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77598/new/

https://reviews.llvm.org/D77598



More information about the cfe-commits mailing list