[PATCH] D155610: [Clang][Sema] Fix display of characters on static assertion failure
Corentin Jabot via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Aug 14 05:56:03 PDT 2023
cor3ntin added inline comments.
================
Comment at: clang/lib/Sema/SemaDeclCXX.cpp:16869
+/// The code point needs to be zero-extended to 32-bits.
+void ConvertCharToString(uint32_t CodePoint, const BuiltinType *BTy,
+ unsigned TyWidth, llvm::raw_ostream &OS) {
----------------
hubert.reinterpretcast wrote:
> It does not seem that the first parameter expects a `CodePoint` argument in all cases. For `Char_S`, `Char_U`, and `Char8`, it seems the function wants to treat the input as a UTF-8 code unit.
>
> I suggest changing the argument to be clearly a code unit (and potentially treat it as a code point value as appropriate later in the function).
>
> Also: The function should probably be declared as having static linkage.
> Additionally: The function does not "convert" in the language semantic sense. `WriteCharacterValueDescriptionForDisplay` might be a better name.
Agreed, `CodeUnit` or `Value` would be more correct (mostly because of numeric escape sequences).
But if we are going to change that then `WriteCharValueForDiagnostic` would be better, `Character` implies too much
================
Comment at: clang/lib/Sema/SemaDeclCXX.cpp:16876
+ // other types.
+ if (CodePoint <= UCHAR_MAX) {
+ StringRef Escaped = escapeCStyle<EscapeChar::Single>(CodePoint);
----------------
hubert.reinterpretcast wrote:
> For types other than `Char_S`, `Char_U`, and `Char8`, this fails to treat the C1 Controls and Latin-1 Supplement characters as Unicode code points. It looks like test coverage for these cases are missing.
`escapeCStyle` is one of the things that assume ASCII / UTF, but yes, we might as well reduce to 0x7F just to avoid unnecessary work
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D155610/new/
https://reviews.llvm.org/D155610
More information about the cfe-commits
mailing list