[PATCH] D155610: [Clang][Sema] Fix display of characters on static assertion failure

Corentin Jabot via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 6 15:00:50 PDT 2023


cor3ntin added inline comments.


================
Comment at: clang/test/SemaCXX/static-assert-cxx26.cpp:304
+static_assert('\u{9}' == (char)1, ""); // expected-error {{failed}} \
+                                       // expected-note {{evaluates to ''\t' (0x09, 9) == '<U+0001>' (0x01, 1)'}}
+static_assert((char8_t)-128 == (char8_t)-123, ""); // expected-error {{failed}} \
----------------
tahonermann wrote:
> cor3ntin wrote:
> > tahonermann wrote:
> > > Is the expected note up to date? I don't see code that would generate the `<U+0001>` output. Am I just missing it? Since U+0001 is a valid, though non-printable, character, I would have expected something more like `'\u0001'`.
> > See elsewhere in the discussion. This formatting is pre-existing and is managed at the DiagnosticEngine level (pushEscapedString). The reason it's not `\u0001` is (1) to avoid reusing C++ syntactic elements for something that comes from diagnostics and is not represented as an escape sequence in source, and (2) `\u00011` is unreadable, and `\U000000001` is also not helpful :)
> > 
> Thanks for the explanation. I'm not sure that I agree with the rationale for (1) though. We're already putting the value in single quotes and representing some values with escapes in many of these cases when the value isn't produced by an escape sequence (or even a character/string literal); why exclude `\uXXXX`? I agree with the rationale for (2); we could use `'\u{1}'` in that case.
FYI, AFAIK the notation in Clang predates the existence of `\u{}` by a few years and follows Unicode notation (https://unicode.org/mail-arch/unicode-ml/y2005-m11/0060.html).
The oldest instance seems to be https://github.com/llvm/llvm-project/commit/77091b167fd959e1ee0c4dad4ec44de43b6c95db - I followed suit when reworking the generic escaping mechanism that all strings fed to diagnostics go through.

I don't care about changing the syntax, but I do hope we are consistent. Ultimately, what we are trying to do is designate a Unicode code point, and whether we do it through C++ syntax or not probably does not matter much as long as it's clear, delimited, and consistent!
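
To make the "clear, delimited" point concrete, here is a minimal sketch of the two display styles seen in the expected note: a C-style escape for common control characters such as tab, and the delimited `<U+XXXX>` form for everything else that is not printable ASCII. The names `describeCodePoint` and `describeText` are hypothetical and this is not the actual pushEscapedString implementation. Treating every non-ASCII code point as non-printable is a simplifying assumption of the sketch.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not Clang's API): render one code point for display.
std::string describeCodePoint(char32_t cp) {
    // A few common control characters get a C-style escape, as '\t' does
    // in the expected note.
    switch (cp) {
    case U'\t': return "\\t";
    case U'\n': return "\\n";
    case U'\r': return "\\r";
    default: break;
    }
    // Printable ASCII is shown as-is.
    if (cp >= 0x20 && cp < 0x7F)
        return std::string(1, static_cast<char>(cp));
    // Assumption: everything else is written in delimited Unicode notation,
    // zero-padded to at least four hex digits.
    char buf[16];
    std::snprintf(buf, sizeof buf, "<U+%04X>", static_cast<unsigned>(cp));
    return buf;
}

// Hypothetical helper: render a whole sequence of code points.
std::string describeText(const std::u32string &text) {
    std::string out;
    for (char32_t cp : text)
        out += describeCodePoint(cp);
    return out;
}
```

With this sketch, U+0001 followed by the digit `1` renders as `<U+0001>1`, whereas an undelimited `\u0001` followed by `1` would read as `\u00011`, which is the readability problem raised in point (2).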


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155610/new/

https://reviews.llvm.org/D155610


