[PATCH] D106577: [clang] Define __STDC_ISO_10646__

Richard Smith - zygoloid via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Jul 23 15:48:47 PDT 2021


rsmith added a comment.

I think this patch as it stands would cause problems with libc implementations that put their `#define` somewhere other than than `stdc-predef.h` (eg, older versions of glibc that put it in `features.h` or other standard libraries that put it in `yvals.h`) due to the macro redefinition with a conflicting expansion. Such implementations are non-conforming but certainly exist. Likewise it could break the build for the same reason if we start implicitly including `stdc-predef.h` at some point. So I think that blindly predefining this macro will prove problematic in at least some circumstances.

Perhaps we can treat `__STDC_ISO_10646__` as a builtin macro, with the following properties:

- `-Wbuiltin-macro-redefined` is suppressed for it, at least in system headers -- if libc redefines it, that's fine, we'll use the libc definition.
- The builtin macro behaves as if it's not defined if `wchar_t` literals don't use UTF-32 encoding.
- When using UTF-32 encoding, the value of the macro is determined by asking the target how its libc behaves, unless we're in freestanding mode.
- For targets that use a `stdc-predef.h`, the target determines the value by preprocessing `stdc-predef.h` and inspecting what value it sets the macro to. Other targets can do whatever makes sense for them, such as sniffing the libc version and defining the macro based on that, or just hardcoding the right value.

(And we can do similar things for other `__STDC_*` macros that get defined by `stdc-predef.h` by some standard libraries.)

This has some nice implications: we don't ever look at `stdc-predef.h` unless a libc-dependent macro is actually inspected, don't have conflicts between our macro and one defined by libc headers, can still produce a proper environment in freestanding mode, can still define the macro when using a libc version that doesn't happen to define the macro itself at all, and so on. Also, if the macro value ever needs to take compiler support into account (for example, if a Python-style `L"\N{SNOWMAN}"` syntax is added and the macro is supposed to indicate which character names are available too), we can easily support that. But it would be a little complex, and the macro wouldn't appear in `-dM -E` unless we added some special-casing for it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106577/new/

https://reviews.llvm.org/D106577



More information about the cfe-commits mailing list