[clang] [clang][Diagnostics] Highlight code snippets (PR #66514)
Aaron Ballman via cfe-commits
cfe-commits at lists.llvm.org
Thu Jan 18 08:38:53 PST 2024
Timm Bäder <tbaeder at redhat.com>
In-Reply-To: <llvm.org/llvm/llvm-project/pull/66514 at github.com>
================
@@ -1112,6 +1121,140 @@ prepareAndFilterRanges(const SmallVectorImpl<CharSourceRange> &Ranges,
return LineRanges;
}
+/// Creates syntax highlighting information in form of StyleRanges.
+///
+/// The returned unique ptr has always exactly size
+/// (\p EndLineNumber - \p StartLineNumber + 1). Each SmallVector in there
+/// corresponds to syntax highlighting information in one line. In each line,
+/// the StyleRanges are non-overlapping and sorted from start to end of the
+/// line.
+std::unique_ptr<llvm::SmallVector<TextDiagnostic::StyleRange>[]>
+highlightLines(StringRef FileData, unsigned StartLineNumber,
+ unsigned EndLineNumber, const Preprocessor *PP,
+ const LangOptions &LangOpts, uint32_t MaxHighlightFileSize,
+ FileID FID, const SourceManager &SM) {
+ assert(StartLineNumber <= EndLineNumber);
+ auto SnippetRanges =
+ std::make_unique<SmallVector<TextDiagnostic::StyleRange>[]>(
+ EndLineNumber - StartLineNumber + 1);
+
+ if (!PP)
+ return SnippetRanges;
+
+ // Might cause emission of another diagnostic.
+ if (PP->getIdentifierTable().getExternalIdentifierLookup())
+ return SnippetRanges;
+
+ auto Buff = llvm::MemoryBuffer::getMemBuffer(FileData);
+ if (Buff->getBufferSize() > MaxHighlightFileSize)
+ return SnippetRanges;
+
+ Lexer L{FID, *Buff, SM, LangOpts};
+ L.SetKeepWhitespaceMode(true);
+
+ // Classify the given token and append it to the given vector.
+ auto appendStyle =
+ [PP, &LangOpts](SmallVector<TextDiagnostic::StyleRange> &Vec,
+ const Token &T, unsigned Start, unsigned Length) -> void {
+ if (T.is(tok::raw_identifier)) {
+ StringRef RawIdent = T.getRawIdentifier();
+ // Special case true/false/nullptr literals, since they will otherwise be
+ // treated as keywords.
+ if (RawIdent == "true" || RawIdent == "false" || RawIdent == "nullptr") {
----------------
AaronBallman wrote:
There's no programmatic way to obtain those, but perhaps we could change the macros usable with TokenKinds.def to expose them. I was thinking we'd manually encode them here, but a programmatic approach is more future-proof. I'm comfortable going with your preference; if you do manually encode them, I'd recommend switching to a `StringSwitch` instead of a series of `||`.
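
For illustration only, the "programmatic" option could look roughly like the X-macro pattern that TokenKinds.def already relies on. This is a minimal standalone sketch, not the patch's code: the `LITERAL_KEYWORDS` list and the `isLiteralKeyword` helper are hypothetical names, TokenKinds.def does not expose such a category today, and `std::string_view` stands in for `llvm::StringRef` so the snippet compiles without LLVM headers.

```cpp
#include <string_view>

// Hypothetical stand-in for a category that TokenKinds.def might expose.
// Each entry is a keyword that should be highlighted as a literal.
#define LITERAL_KEYWORDS(X)                                                    \
  X(true)                                                                      \
  X(false)                                                                     \
  X(nullptr)

// Returns true if RawIdent spells one of the literal keywords listed above.
// The comparisons are generated from the list, so adding an entry to
// LITERAL_KEYWORDS updates this check without touching the function body.
constexpr bool isLiteralKeyword(std::string_view RawIdent) {
#define X(NAME)                                                                \
  if (RawIdent == #NAME)                                                       \
    return true;
  LITERAL_KEYWORDS(X)
#undef X
  return false;
}

// Compile-time checks of the generated comparisons.
static_assert(isLiteralKeyword("nullptr"), "nullptr is a literal keyword");
static_assert(!isLiteralKeyword("int"), "int is an ordinary keyword");
```

With the manual encoding instead, the same three strings would be listed directly at the call site, for example as cases of a `StringSwitch`, at the cost of having to remember to update that list when a new literal keyword is added.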
https://github.com/llvm/llvm-project/pull/66514