<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Jan 12, 2015 at 1:26 AM, David Chisnall <span dir="ltr"><<a href="mailto:David.Chisnall@cl.cam.ac.uk" target="_blank">David.Chisnall@cl.cam.ac.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span class="">On 12 Jan 2015, at 08:07, Dimitry Andric <<a href="mailto:dimitry@andric.com">dimitry@andric.com</a>> wrote:<br>

><br>

><br>

> On 15 Oct 2014, at 19:42, Richard Smith <<a href="mailto:richard@metafoo.co.uk">richard@metafoo.co.uk</a>> wrote:<br>

>> On 15 Oct 2014 05:12, "Ed Schouten" <<a href="mailto:ed@80386.nl">ed@80386.nl</a>> wrote:<br>

> ...<br>

>> The test case in the LLVM tree is invalid and should be discarded. It<br>

>> erroneously assumes that the encoding of wchar_t is independent of the<br>

>> locale.<br>

>><br>

> That makes no sense. These values are compile-time constants and cannot possibly depend on the locale.<br>

<br>

</span>I believe Richard is wrong here.  There are a number of similar compile-time constant macros in the C spec.  I believe the clue as to the correct reading of the spec is in the name of the macro: __STDC_MB_MIGHT_NEQ_WC__<br>

<br>

Note the word *might*.  It means that it is not safe for code to assume that a cast will give the corresponding char value if one exists, i.e. that assumption is not true for all locales.<br></blockquote><div><br></div><div>You should read the definitions in the relevant standards rather than trying to guess what the macro means from its name. Here is the definition:</div><div><br></div><div>"The integer constant 1, intended to indicate that, in the encoding for wchar_t, a member of the basic character set need not have a code value equal to its value when used as the lone character in an integer character constant."</div><div><br></div><div>So the value 1 indicates that 'x' might not equal L'x' for some character x in the basic character set. (Note that the 'might' means that there might exist some *character* where this happens, not that there might exist some *locale* where this happens.) Since the values of 'x' and L'x' are determined at translation time, this property obviously cannot depend in any way on the current locale in the execution environment.</div><div><br></div><div>Note that the above property is *exactly* what the test is testing for.</div><div><br></div><div>However, the FreeBSD folks don't seem interested in fixing their bug, and it's technically conforming for an implementation to define this macro to 1 in any situation: a member of the basic source character set "need not" have the same value as a narrow or wide character, even though they all actually do. That makes this a quality-of-implementation issue, and I'm tired of discussing it, so I've relaxed the test for FreeBSD in r225751.</div></div></div></div>
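<div><br></div><div>To make the translation-time point concrete, here is a minimal sketch in C. It is not the test from the LLVM tree; the names DIFFERS, SAMPLED_MISMATCHES, MB_MIGHT_NEQ_WC, sampled_mismatches and mb_might_neq_wc are made up for illustration, and it samples only a handful of basic characters. Every comparison is an integer constant expression, so the compiler evaluates it before any locale exists.</div>

```c
/* Paste the L prefix onto a character constant:
 * DIFFERS('a') expands to (('a') != L'a'). */
#define DIFFERS(c) ((c) != L##c)

/* An enumerator must be an integer constant expression, so this count is
 * fixed at translation time and cannot depend on the run-time locale.
 * Only a small sample of the basic character set is checked here. */
enum {
    SAMPLED_MISMATCHES =
        DIFFERS('a') + DIFFERS('z') + DIFFERS('A') + DIFFERS('Z') +
        DIFFERS('0') + DIFFERS('9') + DIFFERS(' ') + DIFFERS('_') +
        DIFFERS('{') + DIFFERS('~')
};

/* Record whether the implementation predefines the macro at all. */
#ifdef __STDC_MB_MIGHT_NEQ_WC__
enum { MB_MIGHT_NEQ_WC = 1 };
#else
enum { MB_MIGHT_NEQ_WC = 0 };
#endif

/* Number of sampled characters whose narrow and wide values differ. */
int sampled_mismatches(void) { return SAMPLED_MISMATCHES; }

/* 1 if the implementation claims that 'x' might differ from L'x'. */
int mb_might_neq_wc(void) { return MB_MIGHT_NEQ_WC; }
```

<div>On an implementation whose narrow encoding is ASCII-compatible and whose wchar_t holds Unicode code points, I would expect sampled_mismatches() to return 0 whether or not the macro is defined, and no call to setlocale() can change that result.</div>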