[LLVMdev] [cfe-dev] Reminder: 3.6 branch is coming

Richard Smith richard at metafoo.co.uk
Mon Jan 12 18:01:38 PST 2015


On Mon, Jan 12, 2015 at 1:26 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:

> On 12 Jan 2015, at 08:07, Dimitry Andric <dimitry at andric.com> wrote:
> >
> >
> > On 15 Oct 2014, at 19:42, Richard Smith <richard at metafoo.co.uk> wrote:
> >> On 15 Oct 2014 05:12, "Ed Schouten" <ed at 80386.nl> wrote:
> > ...
> >> The test case in the LLVM tree is invalid and should be discarded. It
> >> erroneously assumes that the encoding of wchar_t is independent of the
> >> locale.
> >>
> > That makes no sense. These values are compile-time constants and cannot
> > possibly depend on the locale.
>
> I believe Richard is wrong here.  There are a number of similar
> compile-time constant macros in the C spec.  I believe the clue as to the
> correct reading of the spec is in the name of the macro:
> __STDC_MB_MIGHT_NEQ_WC__
>
> Note the word *might*.  It means that it is not safe for code to assume
> that a cast will give the corresponding char value if one exists.  i.e.
> that assumption is not true for all locales.
>

You should read the definitions in the relevant standards rather than
trying to guess what the macro means from its name. Here is the definition:

"The integer constant 1, intended to indicate that, in the encoding for
wchar_t, a member of the basic character set need not have a code value
equal to its value when used as the lone character in an integer character
constant."

So the value 1 indicates that 'x' might not equal L'x' for some character x
in the basic character set. (Note that the 'might' means that there might
exist some *character* where this happens, not that there might exist some
*locale* where this happens.) Since the values of 'x' and L'x' are
determined at translation time, this property obviously cannot depend in
any way on the current locale in the execution environment.
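
To make the translation-time point concrete, here is a minimal sketch (not the
test from the LLVM tree). The names kNarrowEqualsWide, kMacroDefined and
narrow_equals_wide_after_setlocale are hypothetical, chosen only for this
illustration, and the handful of characters checked is an arbitrary sample of
the basic character set:

```cpp
#include <cassert>
#include <clocale>

// Every operand is a literal, so the compiler fixes the result during
// translation. Whether it is true depends on the implementation's wchar_t
// encoding, not on anything that happens at run time.
constexpr bool kNarrowEqualsWide =
    'a' == L'a' && 'Z' == L'Z' && '0' == L'0' && ' ' == L' ' && '~' == L'~';

// Records whether the implementation defines the macro under discussion.
#ifdef __STDC_MB_MIGHT_NEQ_WC__
constexpr bool kMacroDefined = true;
#else
constexpr bool kMacroDefined = false;
#endif

// Hypothetical helper: switches the execution locale, then reports the
// constant again. The value cannot change, because it was computed before
// the program ever ran.
inline bool narrow_equals_wide_after_setlocale() {
    std::setlocale(LC_ALL, "");  // adopt whatever locale the environment names
    return kNarrowEqualsWide;
}
```

Calling narrow_equals_wide_after_setlocale() under any locale returns the same
value as kNarrowEqualsWide, which is the sense in which the property cannot
depend on the execution environment.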

Note that the above property is *exactly* what the test is testing for.

However... the FreeBSD folks don't seem interested in fixing their bug, and
it's technically conforming for an implementation to define this macro to 1
in any situation: a member of the basic source character set "need not"
have the same value as a narrow or wide character, even though they all
actually do. That makes this a quality-of-implementation issue, and I'm tired
of discussing it, so I've relaxed the test for FreeBSD in r225751.