[PATCH] D34158: For Linux/gnu compatibility, preinclude <stdc-predef.h> if the file is available
James Y Knight via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Aug 8 15:41:56 PDT 2017
jyknight added a comment.
In https://reviews.llvm.org/D34158#827178, @joerg wrote:
> I had a long discussion with James about this on IRC without reaching a clear consensus. I consider forcing this behavior on all targets to be a major bug. It should be opt-in and opt-in only:
>
> (1) The header name is not mandated by any standard. It is not in any namespace generally accepted as implementation-owned.
This is a point. I didn't think it was a big deal, but if you want to argue a different name should be used, that's a reasonable argument. If we can get some agreement amongst other libc vendors to use some more agreeable alternative name, and keep a fallback on linux-only for the "stdc-predef.h" name, I'd consider that as a great success.
> (2) It adds magic behavior that can make debugging more difficult. Partially preprocessed sources for example could be compiled with plain -c before, now they need a different command line.
If this is a problem, making it be Linux-only does _nothing_ to solve it. But I don't actually see how this is a substantively new problem? Compiling with plain -c before would get #defines for those predefined macros that the compiler sets, even though you may not have wanted those. Is this fundamentally different?
> (3) It seems to me that the GNU userland (and maybe Windows) is the exception to a well integrated tool chain. Most other platforms have a single canonical libc, libm and libpthread implementation and can as such directly define all the relevant macros directly in the driver.
I don't think this is accurate. There are many platforms out there, and for almost none of them do we have exact knowledge of the libc's features encoded into the compiler. I'd note that not only would you need this for every (OS, libc) combination, you'd need it for every (OS, libc-VERSION) combination.
> Given that many of the macros involved are already reflected by the compiler behavior anyway, they can't be decoupled. I.e. the questionable concept of locale-independent wchar_t is already hard-coded in the front end as soon as any long character literals are used.
AFAICT, this example is not actually the case. The frontend only needs to know *a* valid encoding for wide-character literals in some implementation-defined locale (for example, it might always emit them as Unicode code points, as clang does). The standard says: "the array elements [...] are initialized with the sequence of wide characters corresponding to the multibyte character sequence, as defined by the mbstowcs function with an implementation defined current locale."
On the other hand, I believe the intent of this macro (`__STDC_ISO_10646__`) is to guarantee that, _no matter what_ the locale is, a U+0100 character (say, translated with mbrtowc from the locale encoding) gets represented as 0x100.
Thus, it's "fine" for the frontend to always emit wchar_t literals as unicode, yet, the libc may sometimes use an arbitrary different internal encoding, depending on the locale used at runtime. (Obviously using wide character literals with such a libc would be a poor user experience, and such a libc probably ought to reconsider its choices, but that's a different discussion.)
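To make the distinction concrete, here is a minimal sketch in C. The function names are hypothetical, and the claim that the literal evaluates to 0x100 is an assumption that holds for clang and for GCC with its default wide execution character set; the standard itself leaves the value implementation-defined.

```c
/* Value the compiler picked for a wide-character literal at translation
   time. This is decided by the frontend alone, not by the libc. */
long wide_literal_value(void)
{
    return (long)L'\u0100';
}

/* 1 if the implementation as a whole promises ISO 10646 wchar_t values
   regardless of the runtime locale, 0 if no such promise is made. */
int promises_iso_10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}
```

The first function can return 0x100 even when the second returns 0: the literal's encoding is fixed by the compiler, while the macro describes what mbrtowc and friends produce at runtime.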
> As such, please move the command line additions back into the target-specific files for targets that actually want to get this behavior.
Without even a suggestion of a better solution to use for other targets, I find it to be a real shame to push for this to be linux-only, and leave everyone else hanging. I find that _most_ of these defines are effectively library decisions. I would further claim that these are likely to vary over the lifetime of even a single libc, and that as such we would be better served by allowing the libc to set them directly, rather than encoding them into the compiler.
TTBOMK, no targets except linux/glibc/gcc actually comply with this part of the C99/C11 standard today, and so this feature would be useful to have available across all targets.
(I very much dislike that the C standard has started adding all these new predefined macros, instead of exposing them from a header, but there's not much to be done about that...)
Going through the various macros:
`__STDC_ISO_10646__`:
As discussed above, this is effectively entirely up to the libc. The compiler only need support one possible encoding for wchar_t, and clang always chooses unicode. But it can't define this because the libc might use others as well.
`__STDC_MB_MIGHT_NEQ_WC__`:
As with `__STDC_ISO_10646__`, this is up to the libc not the compiler. (At least, I think so... I note that clang currently sets this for freebsd, with a FIXME next to it saying it's only intended to apply to wide character literals. I don't see that the standard says that, however, so, IMO, having it set for freebsd was and is correct).
`__STDC_UTF16__`, `__STDC_UTF32__`:
Again, analogous to `__STDC_ISO_10646__`, except dealing with char16_t/char32_t. This should be set if the libc guarantees that mbrtoc16/mbrtoc32 generate UTF-16/UTF-32-encoded data.
`__STDC_ANALYZABLE__`
Possibly(?) entirely on the compiler side. (I'm going to bet that nobody will ever implement this, though.)
`__STDC_IEC_559__`, `__STDC_IEC_559_COMPLEX__`:
Requires cooperation between library and compiler; some parts of compliance are in the compiler, and some are in the libm routines. GCC defines `__GCC_IEC_559` to indicate when the compiler intends to comply, and glibc then sets `__STDC_IEC_559__` depending on the GCC define, to indicate the system as a whole is intended to comply.
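The handshake described above can be sketched roughly as follows. This is not glibc's actual header; `SKETCH_STDC_IEC_559` is a hypothetical stand-in for `__STDC_IEC_559__` (so the sketch does not clash with whatever the real toolchain defines), and leaving it undefined when the compiler gives no signal is a choice of this sketch only.

```c
/* What a libc-provided, pre-included header might do: claim compliance
   for the whole system only if the compiler says it intends to comply. */
#if defined(__GCC_IEC_559) && __GCC_IEC_559 > 0
#  define SKETCH_STDC_IEC_559 1
#endif

/* 1 if the compiler signals IEC 559 intent via __GCC_IEC_559, else 0. */
int compiler_intends_iec_559(void)
{
#if defined(__GCC_IEC_559) && __GCC_IEC_559 > 0
    return 1;
#else
    return 0;
#endif
}

/* 1 if the (sketched) library header went on to claim compliance. */
int system_claims_iec_559(void)
{
#ifdef SKETCH_STDC_IEC_559
    return 1;
#else
    return 0;
#endif
}
```

A real libc would additionally know whether its own libm routines comply, which is exactly the half of the decision the compiler cannot make.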
`__STDC_LIB_EXT1__`:
Entirely a library issue, nothing to do with the compiler.
`__STDC_NO_COMPLEX__`:
Given that clang supports complex, what remains is: does libc provide complex.h and the functions within?
`__STDC_NO_THREADS__`:
Given that clang supports _Atomic (and provides stdatomic.h), what remains is: does libc provide threads.h and the functions within?
`__STDC_NO_VLA__`:
Entirely on the compiler, nothing to do with libc.
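For reference, this is roughly how user code ends up consuming these macros, which is why their accuracy matters. A minimal sketch; the struct and function names are hypothetical, and the result depends entirely on what the compiler and any pre-included header define on the platform at hand.

```c
/* Snapshot of a few of the conditional-feature macros discussed above.
   Each field is 1 if the corresponding macro is defined, 0 otherwise. */
struct stdc_features {
    int no_complex;  /* __STDC_NO_COMPLEX__: library and compiler */
    int no_threads;  /* __STDC_NO_THREADS__: library              */
    int no_vla;      /* __STDC_NO_VLA__: compiler only            */
    int lib_ext1;    /* __STDC_LIB_EXT1__: library only           */
};

struct stdc_features probe_stdc_features(void)
{
    struct stdc_features f = {0, 0, 0, 0};
#ifdef __STDC_NO_COMPLEX__
    f.no_complex = 1;
#endif
#ifdef __STDC_NO_THREADS__
    f.no_threads = 1;
#endif
#ifdef __STDC_NO_VLA__
    f.no_vla = 1;
#endif
#ifdef __STDC_LIB_EXT1__
    f.lib_ext1 = 1;
#endif
    return f;
}
```

If the libc lacks threads.h but nothing defines `__STDC_NO_THREADS__`, code that trusts the macro will try to include a header that does not exist; only the libc is in a position to get that answer right.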
https://reviews.llvm.org/D34158