[PATCH] D109701: [clang] Emit SARIF Diagnostics: Create `clang::SarifDocumentWriter` interface

Vaibhav Yenamandra via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jun 27 09:41:34 PDT 2022


vaibhav.y added inline comments.


================
Comment at: clang/lib/Basic/Sarif.cpp:161
+    Region["endColumn"] = adjustColumnPos(
+        R.getEnd(), Lexer::MeasureTokenLength(R.getEnd().getLocWithOffset(0),
+                                              R.getEnd().getManager(), LO));
----------------
aaron.ballman wrote:
> vaibhav.y wrote:
> > aaron.ballman wrote:
> > > I didn't catch this during the review -- but this is a layering violation that caused link errors on some of the build bots. Lexer can call into Basic, but Basic cannot call into Lexer. So we'll need to find a different way to handle this.
> > Would moving the code to Support, having it depend on Basic & Lex work?
> I don't think so -- Support is supposed to be a pretty low-level interface; it currently only relies on LLVM's Support library. I think the layering is supposed to be: Support -> Basic -> Lex.
> 
> As I see it, there are a couple of options as to where this could live. It could live in the Frontend library, as that's where all the diagnostic consumer code in Clang lives. But that library might be a bit "heavy" to pull into other tools (maybe? I don't know). It could also live in AST -- that already links in Basic and Lex. But that feels like a somewhat random place for this to live as this has very little to do with the AST itself.
> 
> Another approach, which might be better, is to require the user of this interface to pass in the token length calculation themselves in the places where it's necessary. e.g., `json::Object whatever(SourceLocation Start, SourceLocation End, unsigned EndLen)` and then you can remove the reliance on the lexer entirely while keeping the interface in Basic. I'm not certain how obnoxious this suggestion is, but I think it's my preferred approach for the moment (but not a strongly held position yet). WDYT of this approach?
I think injecting the function is the better approach here. I've tried to make the smallest change possible by passing in a function whose interface is almost identical to `Lexer::MeasureTokenLength`. The intent was to hint that this is the "canonical metric" for token lengths (the tests include an example of exactly that).

I also tried passing start and end locations, but couldn't find a strong use case yet, since `end` would almost always be `Lexer::getLocForEndOfToken(start, 0)`.
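For readers following along, here is a rough sketch of what that injection could look like. The `TokenLengthFn` alias and `measureEndToken` method are illustrative names only, not the actual patch; the point is that the callback's shape mirrors `Lexer::MeasureTokenLength`, so callers can pass that function in directly:

```cpp
// --- in clang/Basic (e.g. a header like clang/include/clang/Basic/Sarif.h) ---
#include "clang/Basic/LangOptions.h"
#include "clang/Basic/SourceLocation.h"
#include "clang/Basic/SourceManager.h"
#include "llvm/ADT/STLExtras.h"

namespace clang {

// Callback whose signature mirrors Lexer::MeasureTokenLength, hinting that it
// is the canonical way to measure the final token of a range.
using TokenLengthFn = llvm::function_ref<unsigned(
    SourceLocation, const SourceManager &, const LangOptions &)>;

class SarifDocumentWriter {
public:
  // The caller supplies the measurement; Basic never calls into Lex.
  unsigned measureEndToken(FullSourceLoc End, const LangOptions &LO,
                           TokenLengthFn MeasureTokenLength) const {
    return MeasureTokenLength(End, End.getManager(), LO);
  }
};

} // namespace clang

// --- at a call site in a library that is allowed to link Lex (e.g. Frontend) ---
//   Writer.measureEndToken(EndLoc, LangOpts, Lexer::MeasureTokenLength);
```

Since `llvm::function_ref` is a non-owning view of the callable, Basic gains no link-time dependency on clangLex; only the caller that supplies `Lexer::MeasureTokenLength` needs to link it.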


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109701/new/

https://reviews.llvm.org/D109701


