[cfe-dev] Crash on token spelling for a comment (found in -verify)

David Blaikie dblaikie at gmail.com
Thu Dec 8 20:32:44 PST 2011


While writing some new tests I found a simple example that crashes under
clang's -verify. I'm a little concerned that this could hit real code
(especially in things like IDE plugins, etc., that may be more interested in
the spelling of tokens).

The repro is simple:

// \

This causes the verify mode to try to get the spelling of this comment
token. So long as the following line is empty, this crashes due to going
off the end inside the loop around Lexer.cpp:306-310. There,
getCharAndSizeNoWarn is called right at the trailing '\' and it cannot
fulfill its contract, as there is no valid character for it to return.
Instead it returns a character off the end of the buffer & an increment
count that puts Ptr one unit /beyond/ 'End'. The loop will now never
satisfy its exit criteria & walks into memory that it shouldn't.
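
To make the failure concrete, here is a minimal model of that walk. It is a
sketch, not the actual Lexer.cpp code: modelGetCharAndSize and modelWalk are
hypothetical names, positions are indices into a std::string rather than raw
pointers, and I'm assuming the only splice that matters is a backslash
immediately followed by '\n'. The model loop uses '<' so that it terminates;
a loop that only stops when the position lands exactly on End would keep
going once the position has stepped over it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for getCharAndSizeNoWarn: returns the character at
// Pos and adds the number of bytes consumed to Size. A backslash-newline
// pair is skipped, and the character *after* it is returned instead.
// Note that it has no idea where the token ends.
char modelGetCharAndSize(const std::string &Buf, std::size_t Pos,
                         std::size_t &Size) {
  if (Buf[Pos] == '\\' && Pos + 1 < Buf.size() && Buf[Pos + 1] == '\n') {
    Size += 2;                                   // consume the splice
    return modelGetCharAndSize(Buf, Pos + 2, Size);
  }
  Size += 1;                                     // consume one ordinary char
  return Buf[Pos];
}

// Walks the token [Begin, End) and returns the position where the walk
// stopped. A result greater than End means the walk overshot the token.
std::size_t modelWalk(const std::string &Buf, std::size_t Begin,
                      std::size_t End) {
  assert(End <= Buf.size());
  std::size_t Pos = Begin;
  while (Pos < End) {
    std::size_t Size = 0;
    modelGetCharAndSize(Buf, Pos, Size);
    Pos += Size;
  }
  return Pos;
}
```

With the buffer "// \\\n\n" (the repro plus the empty following line) and the
comment token taken as the first 5 bytes, modelWalk(Buf, 0, 5) stops at 6:
the call made at the '\' consumes the splice and then one more character that
belongs to the next line, not to the token.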

Just wondering if anyone has some nice ideas about how to fix this - I
assume it's rather perf critical so I don't want to go mucking with it too
ham-fistedly. The 'obvious' thing from my perspective would be to do the
walk forward at the previous character rather than when we're actually at
the '\', but this interferes with the fast path. The alternative seems to
be to have getCharAndSize[NoWarn] return a boolean about whether or not it
was able to read a char - but that might have similar problems.
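
For illustration, the boolean-returning alternative could look roughly like
the following. Again this is only a sketch under my own assumptions:
tryGetCharAndSize and boundedWalk are hypothetical names, the token end is
passed in explicitly, and only backslash-'\n' splices are handled. It says
nothing about what the extra check costs on the fast path.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical bounded variant: skips any splices starting at Pos, then
// reports whether a character is available inside [Pos, End). Size is
// always set to the number of bytes consumed, even on failure.
bool tryGetCharAndSize(const std::string &Buf, std::size_t Pos,
                       std::size_t End, char &C, std::size_t &Size) {
  assert(End <= Buf.size());
  Size = 0;
  while (Pos + 1 < End && Buf[Pos] == '\\' && Buf[Pos + 1] == '\n') {
    Pos += 2;                     // consume the splice
    Size += 2;
  }
  if (Pos >= End)
    return false;                 // nothing left in the token to return
  C = Buf[Pos];
  Size += 1;
  return true;
}

// Collects the spelling of the token [Begin, End) and returns the position
// where the walk stopped, which never exceeds End.
std::size_t boundedWalk(const std::string &Buf, std::size_t Begin,
                        std::size_t End, std::string &Spelling) {
  std::size_t Pos = Begin;
  while (Pos < End) {
    char C = '\0';
    std::size_t Size = 0;
    bool Got = tryGetCharAndSize(Buf, Pos, End, C, Size);
    if (Got)
      Spelling += C;
    Pos += Size;
    if (!Got)
      break;                      // trailing splice: token is exhausted
  }
  return Pos;
}
```

On the same "// \\\n\n" buffer with a 5-byte token, boundedWalk stops exactly
at 5 and produces the spelling "// ", since the trailing splice contributes
no character.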

Ideas welcome; otherwise I'll just have a tinker & see what sort of perf
results I get (pointers to any standard clang perf benchmarks would be nice).

- David