<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Missing definition for __STDC_ISO_10646__"
   href="https://bugs.llvm.org/show_bug.cgi?id=45613">45613</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Missing definition for __STDC_ISO_10646__
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>C
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>joe@begriffs.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>blitzrakete@gmail.com, dgregor@apple.com, erik.pilkington@gmail.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
          </td>
        </tr></table>
      <p>
        <div>
        <pre>The C99 standard says that the implementation should define the macro
__STDC_ISO_10646__ when wchar_t can hold all Unicode code points. The specific
language is:

Section 6.10.8, Predefined macro names:
<span class="quote">> __STDC_ISO_10646__
> An integer constant of the form yyyymmL (for example, 199712L).
> If this symbol is defined, then every character in the Unicode
> required set, when stored in an object of type wchar_t, has the
> same value as the short identifier of that character. The
> Unicode required set consists of all the characters that are
> defined by ISO/IEC 10646, along with all amendments and
> technical corrigenda, as of the specified year and month.</span >

Clang on macOS and OpenBSD does not define this symbol, while clang on Ubuntu
does.

I created a test program at <a href="https://github.com/begriffs/wchar-conformance">https://github.com/begriffs/wchar-conformance</a> which
sends UTF-8 encodings of all codepoints through mbstowcs() and checks whether
the wchar_t* holds the expected values. On all three platforms (macOS, OpenBSD,
Ubuntu) the test program was successful. Unless I misunderstood the standard,
clang on macOS and OpenBSD should define the symbol, but it does not.
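
For reference, a minimal sketch of how the presence of the macro can be checked
at compile time. The function name iso10646_version is hypothetical and is not
part of the test program above. The sketch only reports what the preprocessor
sees at that point in the translation unit:

```c
/* Hypothetical helper: returns the value of __STDC_ISO_10646__ as seen by
 * the preprocessor, or 0 when the macro is not defined. */
long iso10646_version(void)
{
#ifdef __STDC_ISO_10646__
    return __STDC_ISO_10646__;   /* a yyyymmL constant, e.g. 199712L */
#else
    return 0L;                   /* macro not defined on this platform */
#endif
}
```

A caller can treat a zero result as "macro not defined". Whether a given
platform needs a header to be included first for the macro to become visible
is not something this sketch settles.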

Version specifics:

----------------------------------------------------
Apple LLVM version 10.0.0 (clang-1000.11.45.5)
Target: x86_64-apple-darwin17.7.0
----------------------------------------------------
OpenBSD clang version 8.0.1 (tags/RELEASE_801/final)
                            (based on LLVM 8.0.1)
Target: amd64-unknown-openbsd6.6
----------------------------------------------------
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
Target: x86_64-pc-linux-gnu
(__STDC_ISO_10646__ is 201706)</pre>
        </div>
      </p>


    </body>
</html>