<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - Byte order mark (BOM) leads to diagnostic: expected &quot;FILENAME&quot; or <FILENAME>"
href="https://llvm.org/bugs/show_bug.cgi?id=25023">25023</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Byte order mark (BOM) leads to diagnostic: expected "FILENAME" or <FILENAME>
</td>
</tr>
<tr>
<th>Product</th>
<td>clang
</td>
</tr>
<tr>
<th>Version</th>
<td>3.7
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>libclang
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedclangbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>nikolai.kosjar@theqtcompany.com
</td>
</tr>
<tr>
<th>CC</th>
<td>klimek@google.com, llvm-bugs@lists.llvm.org
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=14978" name="attach_14978" title="Input files and test program using libclang.">attachment 14978</a></span>
Input files and test program using libclang.
...if CXTranslationUnit_PrecompiledPreamble and clang_reparseTranslationUnit()
are used. The preamble might be truncated because of the BOM.
Consider the UTF-8 source file containing a BOM at the beginning (EF BB BF):
$ cat -v bomfile.cpp
M-oM-;M-?#include <nonExistingHeader.h>
int main() { return 0; }
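The M-oM-;M-? in the cat -v output is the three BOM bytes EF BB BF. A minimal
sketch of how a client could detect (and optionally drop) them in a buffer it
has already read; hasUtf8Bom and withoutUtf8Bom are hypothetical helper names
for illustration, not part of libclang:

```cpp
#include <string>

// Hypothetical helper (not a libclang function): returns true if the buffer
// starts with the UTF-8 byte order mark EF BB BF.
bool hasUtf8Bom(const std::string &contents)
{
    return contents.size() >= 3
        && static_cast<unsigned char>(contents[0]) == 0xEF
        && static_cast<unsigned char>(contents[1]) == 0xBB
        && static_cast<unsigned char>(contents[2]) == 0xBF;
}

// Hypothetical helper: returns a copy of the buffer without a leading BOM,
// or an unchanged copy if there is none.
std::string withoutUtf8Bom(const std::string &contents)
{
    return hasUtf8Bom(contents) ? contents.substr(3) : contents;
}
```

Whether handing libclang a BOM-free buffer (for example through CXUnsavedFile)
sidesteps the problem described here has not been checked; reported columns
would presumably shift by three in that case.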
Consider also the following libclang client, which takes arguments for whether
to use a precompiled preamble and how many times to reparse:
$ cat libclangclient.cpp
#include <clang-c/Index.h>
#include <cstdlib>
#include <cstdio>

int main(int argc, char *argv[])
{
    if (argc != 4) {
        fprintf(stderr, "Usage: %s <file> <usePreamble> <reparseCount>\n", argv[0]);
        return 1;
    }
    const bool usePreamble = (argv[2][0] == '1');
    int reparseCount = atoi(argv[3]);

    // displayDiagnostics=1: libclang prints diagnostics itself, to compare
    // against the ones extracted through the API below.
    CXIndex index = clang_createIndex(0, /*displayDiagnostics*/ 1);
    const unsigned options = usePreamble
        ? CXTranslationUnit_PrecompiledPreamble : CXTranslationUnit_None;
    CXTranslationUnit tu = clang_parseTranslationUnit(index, argv[1], 0, 0,
                                                      NULL, 0, options);
    if (!tu) {
        fprintf(stderr, "Failed to parse %s\n", argv[1]);
        clang_disposeIndex(index);
        return 1;
    }
    for (; reparseCount > 0; --reparseCount)
        clang_reparseTranslationUnit(tu, 0, 0, clang_defaultReparseOptions(tu));

    // Print every diagnostic as seen through the libclang API.
    const unsigned diagnosticCount = clang_getNumDiagnostics(tu);
    for (unsigned i = 0; i < diagnosticCount; i++) {
        const CXDiagnostic diagnostic = clang_getDiagnostic(tu, i);
        const CXSourceLocation location = clang_getDiagnosticLocation(diagnostic);
        unsigned line, column;
        clang_getSpellingLocation(location, NULL, &line, &column, NULL);
        const CXString text = clang_getDiagnosticSpelling(diagnostic);
        fprintf(stderr, "-- Extracted diagnostic: %u:%u: %s\n", line, column,
                clang_getCString(text));
        clang_disposeString(text);
        clang_disposeDiagnostic(diagnostic);
    }
    clang_disposeTranslationUnit(tu);
    clang_disposeIndex(index);
    return 0;
}
The output of the program consists of two lines:
* The first line comes from libclang due to clang_createIndex(0,1)
* The second line is generated with the help of clang_getDiagnostic().
Consider the runs with the following configurations:
$ ./libclangclient bomfile.cpp 0 0 # OK, as expected
bomfile.cpp:1:13: fatal error: 'nonExistingHeader.h' file not found
-- Extracted diagnostic: 1:13: 'nonExistingHeader.h' file not found
$ ./libclangclient bomfile.cpp 1 0 # OK, as expected
bomfile.cpp:1:13: fatal error: 'nonExistingHeader.h' file not found
-- Extracted diagnostic: 1:13: 'nonExistingHeader.h' file not found
$ ./libclangclient bomfile.cpp 0 1 # OK, as expected
bomfile.cpp:1:13: fatal error: 'nonExistingHeader.h' file not found
-- Extracted diagnostic: 1:13: 'nonExistingHeader.h' file not found
$ ./libclangclient bomfile.cpp 1 1 # OOPS, misleading extracted diagnostic
bomfile.cpp:1:13: fatal error: 'nonExistingHeader.h' file not found
-- Extracted diagnostic: 1:33: expected "FILENAME" or <FILENAME>
$ ./libclangclient bomfile.cpp 1 2 # OOPS, misleading extracted diagnostic
bomfile.cpp:1:13: fatal error: 'nonExistingHeader.h' file not found
-- Extracted diagnostic: 1:33: expected "FILENAME" or <FILENAME>
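The columns in the runs above are consistent with clang counting bytes, BOM
included: 1:13 is the '<' of the include and 1:33 is its closing '>'. A small
sketch to check that arithmetic; the helper names are made up for
illustration, and byte-based column counting is an assumption here:

```cpp
#include <cstddef>
#include <string>

// First line of bomfile.cpp as raw bytes: the three BOM bytes, then the
// #include directive (without the trailing newline).
const std::string kBomLine = "\xEF\xBB\xBF#include <nonExistingHeader.h>";

// Hypothetical helper: 1-based byte column of the first occurrence of c in
// line, or 0 if c does not occur.
unsigned firstByteColumn(const std::string &line, char c)
{
    const std::size_t pos = line.find(c);
    return pos == std::string::npos ? 0u : static_cast<unsigned>(pos) + 1u;
}

// Hypothetical helper: 1-based byte column of the last occurrence of c in
// line, or 0 if c does not occur.
unsigned lastByteColumn(const std::string &line, char c)
{
    const std::size_t pos = line.rfind(c);
    return pos == std::string::npos ? 0u : static_cast<unsigned>(pos) + 1u;
}
```

Here firstByteColumn(kBomLine, '<') gives 13 and lastByteColumn(kBomLine, '>')
gives 33, so the misleading diagnostic points at the last byte of the include
line.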
Observations:
(1) As soon as the preamble option is enabled and the reparse count is >= 1, a
misleading diagnostic is reported. In real-world source files with a BOM,
many more diagnostics follow that one (as if the header file had not been
included...).
(2) Copy bomfile.cpp to bomfile_with_extra_angle.cpp and append an extra '>' to
the include line. The extracted diagnostic is then the expected one again:
$ ./libclangclient bomfile_with_extra_angle.cpp 1 1
bomfile_with_extra_angle.cpp:1:34: warning: extra tokens at end of #include directive [-Wextra-tokens]
bomfile_with_extra_angle.cpp:1:13: fatal error: 'nonExistingHeader.h' file not found
-- Extracted diagnostic: 1:13: 'nonExistingHeader.h' file not found</pre>
</div>
</p>
</body>
</html>