<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Unicode no-break space is treated in an inconsistent way"
   href="https://bugs.llvm.org/show_bug.cgi?id=39586">39586</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Unicode no-break space is treated in an inconsistent way
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Frontend
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>vincent-llvm@vinc17.net
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
          </td>
        </tr></table>
      <p>
        <div>
        <pre>As a followup to <a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - .../llvm-7/lib/clang/7.0.1/include/limits.h is invalid: no-break space character"
   href="show_bug.cgi?id=39585">bug 39585</a> (which actually is a Debian packaging bug), consider
the following program:

 int a;

#if FOO
#endif

int main (void)
{
  return 0;
}

where the space before "int a;" and the space between "#if" and "FOO" are
no-break spaces (U+00A0).
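
Since the no-break spaces are not visible in the listing above, the file can be
recreated with something like the following. This is only a sketch: it assumes
that the file is encoded in UTF-8 (which this report does not state), so that
each U+00A0 is the byte sequence C2 A0 (octal \302\240 in printf notation).

```shell
# Write tst.c with U+00A0 (assumed UTF-8: bytes 0xC2 0xA0) before "int a;"
# and between "#if" and "FOO".
printf '\302\240int a;\n\n#if\302\240FOO\n#endif\n\nint main (void)\n{\n  return 0;\n}\n' > tst.c
```

The bytes actually written can be inspected with "od -c tst.c".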

Under Debian/unstable:

$ clang-8 tst.c
tst.c:1:1: warning: treating Unicode character as whitespace
      [-Wunicode-whitespace]
 int a;
^
tst.c:3:4: warning: treating Unicode character as whitespace
      [-Wunicode-whitespace]
#if FOO
   ^
2 warnings generated.

But with the -E option:

$ clang-8 -E tst.c
tst.c:3:4: error: invalid token at start of a preprocessor expression
#if FOO
   ^
# 1 "tst.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 349 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "tst.c" 2
 int a;




int main (void)
{
  return 0;
}
1 error generated.

The first no-break space is probably treated as whitespace, as it is without
the -E option, but the second one is not. This is inconsistent.

Previous clang versions behave in the same way.</pre>
        </div>
      </p>


    </body>
</html>