[cfe-dev] [libcxx] RFC: Bringing sanity to platform specific <locale>
Craig, Ben via cfe-dev
cfe-dev at lists.llvm.org
Wed Feb 3 07:39:24 PST 2016
The locale feature currently handles platform specific code in a very
inconsistent and difficult to update manner. I think there are even
conformance issues buried in there. I'll start with my suggested fix,
then talk about some of the problems that are in the current code base.
Suggested fix:
In the generic parts of the code base, don't call locale functions from
outside the C and C++ standards directly. That means no calls to
POSIX-only functions, no calls to the *_l functions, and no calls to
newlocale. Instead, call into functions from the support headers.
These functions will all have __* names. For example, locale.cpp and
<locale> would no longer call btowc_l or wcsnrtombs_l directly, but
would instead call __btowc_l and __wcsnrtombs_l. On platforms that
natively have the _l functions, the associated support headers would
have a static inline forwarding function.
To avoid duplication of effort, common implementations of the various
functions will be provided in helper headers where it makes sense. I
expect there to be a helper header for the __*_l to *_l wrapper
functions, a helper header for the __*_l functions that ignore the
locale, and a helper header for no-op versions of [new|free|use|dup]locale.
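As a rough illustration of the wrapper scheme described above, a support
header might look something like the sketch below. Everything named
sketch_* and the SKETCH_HAS_NATIVE_BTOWC_L macro are hypothetical; the
real wrappers would use reserved __* names, which ordinary example code
must not define. The native branch assumes btowc_l is declared by
<xlocale.h>, as on the BSDs. The fallback branch is the "ignore the
locale" flavor of helper.

```cpp
// Sketch of a support header. All sketch_* names and the
// SKETCH_HAS_NATIVE_BTOWC_L macro are hypothetical.
#include <cwchar>

#if defined(SKETCH_HAS_NATIVE_BTOWC_L)
// The platform has btowc_l (assumed here to come from <xlocale.h>):
// the wrapper is a plain static inline forwarding function.
#include <xlocale.h>
typedef locale_t sketch_locale_t;
static inline wint_t sketch_btowc_l(int c, sketch_locale_t loc) {
    return ::btowc_l(c, loc);
}
#else
// No native btowc_l: this helper ignores the locale argument and
// falls back to the standard C function.
typedef void* sketch_locale_t;
static inline wint_t sketch_btowc_l(int c, sketch_locale_t) {
    return std::btowc(c);
}
#endif

// Generic code (the locale.cpp side) only ever sees the wrapper name,
// never btowc_l itself.
static inline wchar_t sketch_do_widen(char c, sketch_locale_t loc) {
    return static_cast<wchar_t>(
        sketch_btowc_l(static_cast<unsigned char>(c), loc));
}
```

With this split, porting to a new platform would mean picking or writing
one support header, without touching the generic code.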
Current Problem Case 1: OS header abstractions
I'll start with the least nasty case. There are several "support"
headers that provide OS specific implementations. For example,
support/musl/xlocale.h and support/win32/locale_win32.h.
In some systems, strtoull_l is provided by the C library or OS. On
others, the support headers provide an implementation. Plenty of other
symbols follow the pattern of strtoull_l (like isupper_l).
I like this approach in general, except that I don't like using the name
strtoull_l on platforms where the C library and OS don't provide that
symbol. I'm not even sure it's conforming, as it doesn't follow the
__lower or _Capital naming scheme that is reserved for the
implementation. A different program (or library) could try to take the
name strtoull_l, and I think that is supposed to work. I know I've
certainly had issues in the past when multiple libraries try to fill in
the gaps of missing OS / library support, and then those gap fillers end
up conflicting.
Current Problem Case 2: Nearly recursive abstractions
* wchar_t ctype_byname<wchar_t>::do_widen(char c) const
This function will forward to btowc_l if we think the OS supports it.
If the OS doesn't have btowc_l, we forward to a libcxx defined
__btowc_l. In isolation, this is fine.
* wint_t __btowc_l(int __c, locale_t __l)
This function will forward to btowc_l if we think the OS supports it.
If the OS doesn't have btowc_l, we set the locale, call btowc, then
restore the locale. In isolation, this is also fine.
In combination, we have a strange duplication of effort. We also have a
confusing situation for anyone who wants to do better than btowc, but
doesn't have __btowc_l.
There are a few other symbols that follow this pattern (wctob_l,
wcsnrtombs_l, wcrtomb_l, and plenty of others).
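The "set the locale, call btowc, then restore the locale" fallback
mentioned for __btowc_l can be sketched roughly as follows. This is not
the libcxx implementation: the function name is hypothetical, and it
takes a locale name and uses std::setlocale as a stand-in for a locale_t
handle so that it only needs the C and C++ standard libraries. Note that
swapping the global locale like this is not thread safe.

```cpp
#include <clocale>
#include <cwchar>
#include <string>

// Hypothetical fallback: convert one byte under a named locale by
// temporarily switching the global LC_CTYPE locale.
inline wint_t sketch_btowc_with_locale(int c, const char* locale_name) {
    // Remember the current locale. The string must be copied, because
    // the buffer returned by setlocale may be overwritten by later calls.
    const char* current = std::setlocale(LC_CTYPE, nullptr);
    std::string saved = current ? current : "C";

    // Switch to the requested locale; report failure as WEOF.
    if (std::setlocale(LC_CTYPE, locale_name) == nullptr)
        return WEOF;

    wint_t result = std::btowc(c);

    // Restore the locale that was active on entry.
    std::setlocale(LC_CTYPE, saved.c_str());
    return result;
}
```

If do_widen and __btowc_l each carry their own "native or fallback"
decision around code like this, the same choice ends up being made in
two places.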
Current Problem Case 3: Hidden abstractions
Windows and Sun includes provide libcxx versions of wcsnrtombs_l in
their respective support directories / headers. But libcxx doesn't call
them, because __wcsnrtombs_l forwards to wcsnrtombs.
Do others see value in fixing these problems? Is this general approach
reasonable?
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project