[cfe-dev] How do I properly address Clang's UTF-8 string literal warning?

Aleksandr Medvedev via cfe-dev cfe-dev at lists.llvm.org
Tue Nov 16 13:44:24 PST 2021


Hello, Richard, and thank you for answering. I keep this warning enabled to
make sure my codebase stays compatible with the C++20 standard (at the moment
I compile it as C++14 because of a legacy part, but all new code needs to be
C++20-compatible).
I agree that the main purpose of the warning is to avoid a change in behavior.
However, the approaches I tried address exactly that, and there seems to be no
way to tell the compiler so. E.g. if I have an overloaded function that takes
a std::u8string argument instead of std::string, clang doesn't seem to have
any means of telling whether the overload was added to address the warning or
whether these are simply two distinct functions.
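
To make the overload case concrete, here is a minimal sketch of what I mean.
The names describe and describe_sample are made up for illustration, and the
char8_t overload is guarded by the library feature-test macro so the same
file still builds as C++14. The u8 literal is still there, so I would expect
the warning to fire on it in C++14 mode regardless of the overload:

```cpp
#include <string>

// Hypothetical function, used only for illustration.
// Plain-char overload: selected before C++20, where u8"..." is const char[].
inline std::string describe(const std::string& s) {
    return "char:" + std::to_string(s.size());
}

#if defined(__cpp_lib_char8_t)
// char8_t overload: selected when u8"..." is const char8_t[] (C++20).
// Compiled only when the library reports std::u8string support.
inline std::string describe(const std::u8string& s) {
    return "char8_t:" + std::to_string(s.size());
}
#endif

// "Текст" spelled with universal character names so the result does not
// depend on the source file encoding; each letter takes 2 bytes in UTF-8.
inline std::string describe_sample() {
    return describe(u8"\u0422\u0435\u043A\u0441\u0442");
}
```

Both language modes accept the same call site; only the selected overload
differs, which is exactly what the compiler cannot know was intentional.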

Am I correct in saying that removing all u8 literals is the only portable way
to satisfy the warning?

On Mon, Nov 8, 2021, 09:07 Richard Smith <richard at metafoo.co.uk> wrote:

> On Sun, 7 Nov 2021 at 07:29, Aleksandr Medvedev via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> Hello, folks! (And sorry in advance if I'm writing to the wrong address
>> with my question.)
>>
>> Recently I came across a clang warning next to this simple line of
>> code:
>>
>> auto message = u8"Текст";
>>
>> which says as follows:
>>
>>> type of UTF-8 string literal will change from array of const char to
>>> array of const char8_t in C++20
>>
>>
>> That warning comes with the -Wc++20-compat flag
>> <https://clang.llvm.org/docs/DiagnosticsReference.html#wc-20-compat>, and
>> what I'm looking for is a proper way to make the warning
>> disappear (without suppressing or disabling it).
>>
>
> Why do you want this warning enabled? Depending on why you want this
> warning to appear, the answer for how to handle it will be different.
>
>
>> I tried a few approaches from the P1423R2
>> <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r2.html#remediation> document,
>> including:
>>
>>    - explicit conversion function approach
>>    <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r2.html#conversion_fns>
>>    - emulation with macros
>>    <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r2.html#emulate>
>>    - reinterpret_cast u8 literals to char
>>    <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r2.html#reinterpret_cast>
>>
>> Unfortunately, none of them seems to work. The warning persists, probably
>> simply because a u8 literal is present, no matter whether I add
>> overloaded functions, wrap the literal in macros or cast it to "standard"
>> char. Am I missing something? Or is the only way to make this warning
>> vanish to not use u8 literals in my code?
>>
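For reference, the reinterpret_cast approach listed above might look roughly
like the sketch below. The helper name as_char and the function greeting are
hypothetical, not taken from P1423R2 verbatim. As described in the question,
the u8 prefix is still present, so this is not expected to silence the
warning; it only keeps the calling code working in both language modes:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: view a UTF-8 literal as plain char data.
// Before C++20 a u8 literal is already an array of const char.
template <std::size_t N>
const char* as_char(const char (&s)[N]) {
    return s;
}

#if defined(__cpp_char8_t)
// With char8_t enabled the literal is an array of const char8_t.
// Reading it through const char* is allowed, since char may alias any object.
template <std::size_t N>
const char* as_char(const char8_t (&s)[N]) {
    return reinterpret_cast<const char*>(s);
}
#endif

// "Текст" written with universal character names (2 bytes each in UTF-8).
inline std::string greeting() {
    return as_char(u8"\u0422\u0435\u043A\u0441\u0442");
}
```
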
>
> The diagnostic also says:
>
> <source>:1:16: note: remove 'u8' prefix to avoid a change of behavior;
> Clang encodes unprefixed narrow string literals as UTF-8
>
> ... so that's one potential option to ensure the code doesn't change
> meaning in C++20 mode, if you don't need portability to compilers that
> don't assume UTF-8.
>
> Another option is to pre-adopt this C++20 feature with -fchar8_t. That
> again is non-portable, but is available in at least Clang and -- if memory
> serves -- GCC.
>
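A small compile-time check can illustrate the two options above. This is only
a sketch: the alias and constant names are made up, and it assumes the
compiler defines the __cpp_char8_t feature-test macro whenever char8_t is
enabled (C++20 mode, or an earlier mode with -fchar8_t):

```cpp
#include <type_traits>

// Element type of a u8-prefixed literal in the current language mode.
using U8Elem =
    std::remove_cv_t<std::remove_reference_t<decltype(u8"x"[0])>>;
// Element type of an unprefixed narrow literal.
using PlainElem =
    std::remove_cv_t<std::remove_reference_t<decltype("x"[0])>>;

// Unprefixed literals are arrays of const char in every mode, which is why
// dropping the prefix keeps the type stable across standards.
static_assert(std::is_same<PlainElem, char>::value, "plain literal is char");

#if defined(__cpp_char8_t)
// char8_t is enabled: u8 literals are arrays of const char8_t.
static_assert(std::is_same<U8Elem, char8_t>::value, "u8 literal is char8_t");
constexpr bool u8_literal_is_char8_t = true;
#else
// char8_t is not enabled: u8 literals are arrays of const char.
static_assert(std::is_same<U8Elem, char>::value, "u8 literal is char");
constexpr bool u8_literal_is_char8_t = false;
#endif
```

Building the same file with and without -fchar8_t should flip which branch is
compiled, while the unprefixed literal keeps the same type in both builds.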