[cfe-commits] r148389 - in /cfe/trunk: include/clang/Basic/DiagnosticLexKinds.td lib/Lex/LiteralSupport.cpp

Seth Cantrell seth.cantrell at gmail.com
Sun Mar 25 01:57:29 PDT 2012

On Mar 24, 2012, at 10:11 PM, Eli Friedman wrote:

> On Sat, Mar 24, 2012 at 5:21 PM, Seth Cantrell <seth.cantrell at gmail.com> wrote:
>> Okay, here's a different approach. I'm trying to treat the column numbers as being a code unit offset and using that to calculate the real column numbers. It relies on being able to compute the column width of characters, and this hacked up version uses wcswidth to do that even though that's not really going to work (since it's locale based an also because wcswidth seems to be broken). Anyway, here are what the results look like for me in Terminal:
> The results seem roughly correct... this patch doesn't look right,
> though.  Primarily, I really don't like relying on column numbers: the
> source buffer is in bytes, and we want to print the output in columns
> as measured on the terminal.  The column number is neither of those
> (or at least, it isn't supposed to be), so using it here just seems to
> make everything more complicated.

It seems like internally and in APIs we'd generally want 'column' numbers to be byte offsets and leave it to clients to use that to compute what they need, be it character indices or column number or whatever. This seems to be the current behavior and that's not surprising, since originally only ASCII was actually supported. For example SM.getColumnNumber does appear to do this (though one indexed rather than zero indexed). My changes only touch the parts where the column numbers were clearly used as byte offsets. E.g. I replace the previous code

-  if (ColNo-1 < CaretLine.size())
-    CaretLine[ColNo-1] = '^';
-  else
-    CaretLine.push_back('^');

With the following, using a new helper function byteOffsetToConsoleColumnNumber()

+  ColNo = byteOffsetToConsoleColumnNumber(SourceLine,ColNo-1);
+  if (CaretLine.size()<ColNo+1)
+    CaretLine.resize(ColNo+1,' ');
+  CaretLine[ColNo] = '^';


-  for (unsigned i = StartColNo; i < EndColNo; ++i)
-    CaretLine[i] = '~';


+  StartColNo = byteOffsetToConsoleColumnNumber(SourceLine,StartColNo);
+  EndColNo = byteOffsetToConsoleColumnNumber(SourceLine,EndColNo);
+  assert(StartColNo<=EndColNo);
+  if (CaretLine.size() < EndColNo+1)
+    CaretLine.resize(EndColNo+1,' ');
+  std::fill(CaretLine.begin()+StartColNo,CaretLine.begin()+EndColNo,'~');

Is there any better way than to rely on these column numbers? My plan is currently to treat them as byte offsets and to use a centralized method for converting them to Terminal column numbers when printing out diagnostics. There are still more places where I'd need to do that conversion, like selectInterestingSourceRegion().

> On a side note, the handling for source ranges looks it doesn't handle
> multiple source ranges correctly.

Okay, I've corrected that now. I also rolled the tab expansion into the method I'm using for creating 'printable' representations of source, and I think everything should be able to use this to compute correct column numbers now. (except that I still need to find a decent replacement for wcswidth.)

> -Eli
>> On Mar 23, 2012, at 4:42 PM, Eli Friedman wrote:
>>> On Fri, Mar 23, 2012 at 1:49 AM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>>> On Thu, Mar 22, 2012 at 11:57 PM, Seth Cantrell <seth.cantrell at gmail.com> wrote:
>>>>> Here are some patches for review. One enables setting reverse colors with raw_ostreams in LLVM. The other is to print unprintable characters in clang diagnostics in hex with reversed colors.
>>>> The patch for color reversal seems fine, although maybe someone more
>>>> familiar with that code should take a look.  The patch for unprintable
>>>> characters is a step in the right direction, but completely ignores
>>>> the caret/ranges and fixits (so things won't line up).
>>>> I was actually hacking on this code from that angle recently; I'll
>>>> send you what I have soon.
>>> Attaching my work-in-progress, which lines up the caret correctly (but
>>> doesn't have your nice color-formatting, and there are still a couple
>>> other issues with making everything line up).
>>> -Eli
>>> <diagencoding.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hacked_up_column_adjustments.patch
Type: application/octet-stream
Size: 11807 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120325/2c1f61b7/attachment.obj>

More information about the cfe-commits mailing list