[PATCH] Frontend: Fix SourceColumnMap assertion failure on non-ascii characters.
Logan Chien
tzuhsiang.chien at gmail.com
Mon Dec 29 16:11:47 PST 2014
Ping?
On Thu, Dec 25, 2014 at 12:54 AM, Logan Chien <tzuhsiang.chien at gmail.com>
wrote:
> Hi Richard,
>
> I found that some assertion failure will be raised from the
> lib/Frontend/TextDiagnostic.cpp since the assertion incorrectly compares
> the byte index with the number of columns, which is a problem when the
> input source code contains non-ascii characters (multiple bytes per
> column.) The patch should fix the problem, please have a look. Thanks.
>
> http://reviews.llvm.org/D6746
>
> Sincerely,
> Logan
>
>
> On Sat, Dec 20, 2014 at 5:17 PM, Logan Chien <tzuhsiang.chien at gmail.com>
> wrote:
>
>> Hi rsmith,
>>
>> http://reviews.llvm.org/D6746
>>
>> Files:
>> lib/Frontend/TextDiagnostic.cpp
>> test/Frontend/source-col-map.c
>>
>> Index: lib/Frontend/TextDiagnostic.cpp
>> ===================================================================
>> --- lib/Frontend/TextDiagnostic.cpp
>> +++ lib/Frontend/TextDiagnostic.cpp
>> @@ -293,14 +293,14 @@
>>
>> /// \brief Map from a byte index to the next byte which starts a
>> column.
>> int startOfNextColumn(int N) const {
>> - assert(0 <= N && N < static_cast<int>(m_columnToByte.size() - 1));
>> + assert(0 <= N && N < static_cast<int>(m_byteToColumn.size() - 1));
>> while (byteToColumn(++N) == -1) {}
>> return N;
>> }
>>
>> /// \brief Map from a byte index to the previous byte which starts a
>> column.
>> int startOfPreviousColumn(int N) const {
>> - assert(0 < N && N < static_cast<int>(m_columnToByte.size()));
>> + assert(0 < N && N < static_cast<int>(m_byteToColumn.size()));
>> while (byteToColumn(--N) == -1) {}
>> return N;
>> }
>> @@ -323,9 +323,10 @@
>> std::string
>> &FixItInsertionLine,
>> unsigned Columns,
>> const SourceColumnMap &map) {
>> - unsigned MaxColumns = std::max<unsigned>(map.columns(),
>> - std::max(CaretLine.size(),
>> -
>> FixItInsertionLine.size()));
>> + unsigned CaretColumns = CaretLine.size();
>> + unsigned FixItColumns =
>> llvm::sys::locale::columnWidth(FixItInsertionLine);
>> + unsigned MaxColumns = std::max(static_cast<unsigned>(map.columns()),
>> + std::max(CaretColumns, FixItColumns));
>> // if the number of columns is less than the desired number we're done
>> if (MaxColumns <= Columns)
>> return;
>> @@ -1110,12 +1111,13 @@
>> // Copy the line of code into an std::string for ease of manipulation.
>> std::string SourceLine(LineStart, LineEnd);
>>
>> - // Create a line for the caret that is filled with spaces that is the
>> same
>> - // length as the line of source code.
>> - std::string CaretLine(LineEnd-LineStart, ' ');
>> -
>> + // Build the byte to column map.
>> const SourceColumnMap sourceColMap(SourceLine, DiagOpts->TabStop);
>>
>> + // Create a line for the caret that is filled with spaces that is the
>> same
>> + // number of columns as the line of source code.
>> + std::string CaretLine(sourceColMap.columns(), ' ');
>> +
>> // Highlight all of the characters covered by Ranges with ~ characters.
>> for (SmallVectorImpl<CharSourceRange>::iterator I = Ranges.begin(),
>> E = Ranges.end();
>> Index: test/Frontend/source-col-map.c
>> ===================================================================
>> --- /dev/null
>> +++ test/Frontend/source-col-map.c
>> @@ -0,0 +1,41 @@
>> +// RUN: not %clang_cc1 %s -fsyntax-only -fmessage-length 75 -o /dev/null
>> 2>&1 | FileCheck %s -strict-whitespace
>> +
>> +// Test case for the text diagnostics source column conversion crash.
>> +
>> +// There are two problems to be tested:
>> +//
>> +// 1. The assertion in startOfNextColumn() and startOfPreviousColumn()
>> +// should not be raised when the byte index is greater than the
>> column index
>> +// since the non-ascii characters may use more than one bytes to
>> store a
>> +// character in a column.
>> +//
>> +// 2. The length of the caret line should be equal to the number of
>> columns of
>> +// source line, instead of the length of the source line. Otherwise,
>> the
>> +// assertion in selectInterestingSourceRegion will be raised because
>> the
>> +// removed columns plus the kept columns are not greater than the max
>> column,
>> +// which means that we should not remove any column at all.
>> +
>> +// NOTE: This file is encoded in UTF-8 and intentionally contains some
>> +// non-ASCII characters.
>> +
>> +__attribute__((format(printf, 1, 2)))
>> +extern int printf(const char *fmt, ...);
>> +
>> +void test1(Unknown* b); // αααα αααα αααα αααα αααα αααα αααα αααα αααα
>> αααα αααα
>> +// CHECK: unknown type name 'Unknown'
>> +// CHECK-NEXT: void test1(Unknown* b); // αααα αααα αααα αααα αααα αααα
>> αααα ααα...
>> +// CHECK-NEXT: {{^ \^$}}
>> +
>> +void test2(Unknown* b); // αααα αααα αααα αααα αααα αααα αααα αααα αααα
>> +
>> +// CHECK: unknown type name 'Unknown'
>> +// CHECK-NEXT: void test2(Unknown* b); // αααα αααα αααα αααα αααα αααα
>> αααα αααα αααα
>> +// CHECK-NEXT: {{^ \^$}}
>> +
>> +void test3() {
>> + /* αααα αααα αααα αααα αααα αααα αααα αααα αααα αααα */ printf("%d",
>> "s");
>> +}
>> +// CHECK: format specifies type 'int' but the argument has type
>> 'char *'
>> +// CHECK-NEXT: ...αααα αααα αααα αααα αααα αααα αααα αααα αααα */
>> printf("%d", "s");
>> +// CHECK-NEXT: {{^
>> ~~ \^~~$}}
>> +// CHECK-NEXT: {{^
>> %s$}}
>>
>> EMAIL PREFERENCES
>> http://reviews.llvm.org/settings/panel/emailpreferences/
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20141230/96165a36/attachment.html>
More information about the cfe-commits
mailing list