[clang] [clang][Diagnostics] Highlight code snippets (PR #66514)

via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 20 12:29:23 PDT 2023


Timm =?utf-8?q?Bäder?= <tbaeder at redhat.com>,
Timm =?utf-8?q?Bäder?= <tbaeder at redhat.com>
Message-ID:
In-Reply-To: <llvm/llvm-project/pull/66514/clang at github.com>


================
@@ -0,0 +1,77 @@
+
+#include "clang/Frontend/CodeSnippetHighlighter.h"
+#include "clang/Basic/DiagnosticOptions.h"
+#include "clang/Basic/SourceManager.h"
+#include "clang/Lex/Lexer.h"
+#include "clang/Lex/Preprocessor.h"
+#include "clang/Lex/PreprocessorOptions.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace clang;
+
+static SourceManager createTempSourceManager() {
+  FileSystemOptions FileOpts;
+  FileManager FileMgr(FileOpts);
+  llvm::IntrusiveRefCntPtr<DiagnosticIDs> DiagIDs(new DiagnosticIDs());
+  llvm::IntrusiveRefCntPtr<DiagnosticOptions> DiagOpts(new DiagnosticOptions());
+  DiagnosticsEngine diags(DiagIDs, DiagOpts);
+  return SourceManager(diags, FileMgr);
+}
+
+static Lexer createTempLexer(llvm::MemoryBufferRef B, SourceManager &FakeSM,
+                             const LangOptions &LangOpts) {
+  return Lexer(FakeSM.createFileID(B), B, FakeSM, LangOpts);
+}
+
+std::vector<StyleRange> CodeSnippetHighlighter::highlightLine(
+    StringRef SourceLine, const Preprocessor *PP, const LangOptions &LangOpts) {
+  if (!PP)
+    return {};
+  constexpr raw_ostream::Colors CommentColor = raw_ostream::BLACK;
+  constexpr raw_ostream::Colors LiteralColor = raw_ostream::GREEN;
+  constexpr raw_ostream::Colors KeywordColor = raw_ostream::YELLOW;
+
+  SourceManager FakeSM = createTempSourceManager();
+  const auto MemBuf = llvm::MemoryBuffer::getMemBuffer(SourceLine);
+  Lexer L = createTempLexer(MemBuf->getMemBufferRef(), FakeSM, LangOpts);
+  L.SetKeepWhitespaceMode(true);
----------------
cor3ntin wrote:

I can think of 3 cases: line splicing, multi-line comments, raw strings.

 - Line splicing are easy enough to find.
 - There are very few places that will warn/error about the content of multi line comments, and we could have a "stop coloring" RAII object for the places that do.
 - There is the interesting case of comments ending on a line. maybe we need to search for `*/` before lexing?
 - If the comment begin on the line... i guess we have to lex not just the line but everything after.

So that left us with raw strings.
How reasonable would it be for the lexer to maintain a list of SourceRange for raw strings?


https://github.com/llvm/llvm-project/pull/66514


More information about the cfe-commits mailing list