[llvm-bugs] [Bug 39586] New: Unicode no-break space is treated in an inconsistent way
via llvm-bugs
llvm-bugs at lists.llvm.org
Thu Nov 8 01:04:18 PST 2018
https://bugs.llvm.org/show_bug.cgi?id=39586
Bug ID: 39586
Summary: Unicode no-break space is treated in an inconsistent way
Product: clang
Version: unspecified
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Frontend
Assignee: unassignedclangbugs at nondot.org
Reporter: vincent-llvm at vinc17.net
CC: llvm-bugs at lists.llvm.org, richard-llvm at metafoo.co.uk
As a follow-up to bug 39585 (which is actually a Debian packaging bug), consider
the following program:
int a;
#if FOO
#endif
int main (void)
{
return 0;
}
where the space before "int a;" and the space between "#if" and "FOO" are
no-break spaces (U+00A0).
Under Debian/unstable:
$ clang-8 tst.c
tst.c:1:1: warning: treating Unicode character as whitespace [-Wunicode-whitespace]
int a;
^
tst.c:3:4: warning: treating Unicode character as whitespace [-Wunicode-whitespace]
#if FOO
^
2 warnings generated.
But with the -E option:
$ clang-8 -E tst.c
tst.c:3:4: error: invalid token at start of a preprocessor expression
#if FOO
^
# 1 "tst.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 349 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "tst.c" 2
int a;
int main (void)
{
return 0;
}
1 error generated.
The first no-break space is probably treated as whitespace, as it is without the
-E option, but the second one is not. This is inconsistent.
Previous clang versions behave in the same way.