[PATCH] D106577: [clang] Define __STDC_ISO_10646__

ThePhD via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sat Sep 4 10:54:22 PDT 2021


ThePhD added a comment.

Hi, my name is JeanHeyd Meneide. I'm the Project Editor for C, but more importantly I'm the author of this paper: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2728.htm

This paper was accepted yesterday (September 3rd, 2021) into the C Standard and, after I merge it (along with the ~25 other papers and Annex X I still need to merge), will appear in the next draft of the C Standard.

As the paper's introduction and motivation note, the interpretation above, which ties the encoding of `wchar_t` strings and `char` (MBS) strings to the locale-dependent encoding used by runtime functions like `mbstowcs` and `wcstombs`, was not only a little bit silly, but also impossible to enforce properly on most systems without severe undue burden.

The wording of the paper explicitly removes the tie-in of the encoding of string literals and wide string literals to the library functions and instead makes them implementation-defined. This has no behavior change on any platform (it is, in a very strict sense, an expansion of the current definition and a standardization of existing practice amongst all implementations). What it does mean, however, is that Clang and every other compiler - so long as they pick an ISO 10646 code point capable encoding for their `wchar_t` literals - can define this preprocessor macro unconditionally. My understanding is that this applies on most systems where things have not been patched / tweaked, since Clang vastly prefers UTF-32 in most of its setups.

It is my strong recommendation this patch be accepted and made unconditional, both in anticipation of the upcoming standard and the widespread existing practice.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106577/new/

https://reviews.llvm.org/D106577


