[PATCH] D131632: [clang] Enable output of SARIF diagnostics

Christopher Di Bella via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Aug 22 11:34:30 PDT 2022


cjdb added a comment.

This looks great, thank you so much @abrahamcd!



================
Comment at: clang/include/clang/Frontend/SARIFDiagnosticPrinter.h:31
+class SARIFDiagnosticPrinter : public DiagnosticConsumer {
+  raw_ostream &OS;
+  IntrusiveRefCntPtr<DiagnosticOptions> DiagOpts;
----------------
Please make OS a pointer instead of a reference, since member references make usage very unpleasant.


================
Comment at: clang/include/clang/Frontend/SARIFDiagnosticPrinter.h:31-42
+  raw_ostream &OS;
+  IntrusiveRefCntPtr<DiagnosticOptions> DiagOpts;
+
+  /// Handle to the currently active text diagnostic emitter.
+  std::unique_ptr<SARIFDiagnostic> SARIFDiag;
+
+  /// A string to prefix to error messages.
----------------
cjdb wrote:
> Please make OS a pointer instead of a reference, since member references make usage very unpleasant.
Nit: please move all private members below the public interface, where possible.


================
Comment at: clang/include/clang/Frontend/SARIFDiagnosticPrinter.h:42
+
+  unsigned OwnsOutputStream : 1;
+
----------------
Can we make this a bool instead? Unless I'm mistaken, there's 31 bits of padding for `OwnsOutputStream`, as opposed to a bool's (almost certainly) eight bits.


================
Comment at: clang/include/clang/Frontend/SARIFDiagnosticPrinter.h:52
+  /// used.
+  void setPrefix(std::string Value) {
+    Prefix = std::move(Value);
----------------
We should be passing in a `StringRef` rather than a string, since this will unnecessarily construct a string for C string literals.


================
Comment at: clang/lib/Frontend/FrontendAction.cpp:724-727
+    if (CI.getDiagnosticOpts().getFormat() == DiagnosticOptions::SARIF) {
+      auto *Writer = new SarifDocumentWriter(CI.getSourceManager());
+      static_cast<SARIFDiagnosticPrinter *>(&CI.getDiagnosticClient())
+          ->setSarifWriter(Writer);
----------------
We shouldn't have raw owning pointers, since they're vulnerable to memory leaks. Even though LLVM doesn't throw exceptions, it's still good practice to avoid them.


================
Comment at: clang/lib/Frontend/SARIFDiagnostic.cpp:30-31
+#include <string>
+
+using namespace clang;
+
----------------
Please replace this using-directive with a namespace scope.


================
Comment at: clang/lib/Frontend/SARIFDiagnostic.cpp:78
+            emitFilename(FE->getName(), Loc.getManager());
+        // FIXME: No current way to add file-only location to SARIF object
+      }
----------------
I think it would be good to file an issue on GitHub and change to `FIXME(llvm-project/<issue>)`. @aaron.ballman WDYT?


================
Comment at: clang/lib/Frontend/SARIFDiagnostic.cpp:86-88
+  for (ArrayRef<CharSourceRange>::const_iterator RI = Ranges.begin(),
+                                                 RE = Ranges.end();
+       RI != RE; ++RI) {
----------------
I haven't seen any use of `RI` or `RE` in this loop, except to dereference `RI`, so we should change this to a range-for loop.


================
Comment at: clang/lib/Frontend/SARIFDiagnostic.cpp:162
+    break;
+  llvm_unreachable("Not an existing diagnostic level");
+  }
----------------
aaron.ballman wrote:
> 



================
Comment at: clang/lib/Frontend/SARIFDiagnostic.cpp:210-223
+/// ranges necessary.
+void SARIFDiagnostic::emitDiagnosticLoc(FullSourceLoc Loc, PresumedLoc PLoc,
+                                        DiagnosticsEngine::Level Level,
+                                        ArrayRef<CharSourceRange> Ranges) {}
+
+void SARIFDiagnostic::emitIncludeLocation(FullSourceLoc Loc, PresumedLoc PLoc) {
+}
----------------
If these aren't called, then they should probably assert and say that they're not implemented.


================
Comment at: clang/lib/Frontend/SARIFDiagnosticPrinter.cpp:24-25
+#include "llvm/Support/raw_ostream.h"
+#include <algorithm>
+using namespace clang;
+
----------------
Please replace with a namespace scope.


================
Comment at: clang/lib/Frontend/SARIFDiagnosticPrinter.cpp:71
+  // other infrastructure necessary when emitting more rich diagnostics.
+  if (!Info.getLocation().isValid()) { // TODO: What is this case? 
+    // SARIFDiag->addDiagnosticWithoutLocation(
----------------
aaron.ballman wrote:
> vaibhav.y wrote:
> > vaibhav.y wrote:
> > > The location maybe if the diagnostic's source is located in the scratch buffer. Likely for macro expansions where token pasting is involved. Another case would be errors on the command line.
> > > 
> > > I'm not entirely sure how the SARIF spec would handle this case, it might require an extension.
> > > 
> > > A few ways that might work could be:
> > > 
> > > Using the [[ https://docs.oasis-open.org/sarif/sarif/v2.0/csprd02/sarif-v2.0-csprd02.html#_Toc10127692 | logicalLocations ]] property to specify ([[ https://docs.oasis-open.org/sarif/sarif/v2.0/csprd02/sarif-v2.0-csprd02.html#_Toc10127910 | logicalLocation object ]]), this might need an extension for kind: "macro", another case that might need extension is diagnostics about invalid command line flags which are also diagnostics without a valid
> > > 
> > > The parentIndex for these logical locations could be set to the physical location that produced them.
> > > 
> > > I think this definitely warrants some discussion since the spec doesn't provide a clear way to express these cases.
> > > 
> > > WDYT @aaron.ballman @cjdb @denik 
> > The spec does say for "kind":
> > 
> > > If none of those strings accurately describes the construct, kind MAY contain any value specified by the analysis tool.
> > 
> > So an extension might not be necessary, but might be worth discussing.
> From looking through the spec, I think `logicalLocations` is probably the right choice and we'd want to make up our own kind for things like the scratch buffer or the command line. I think an extension would be worth discussing.
We should defer this to a future CL, so that Abraham isn't blocked by our decision-making (and so we can make the right decision). I can start a GitHub issue to get the discussion in a good spot?


================
Comment at: clang/test/Frontend/sarif-diagnostics.cpp:3
+// RUN: FileCheck -dump-input=always %s --input-file=%t
+// CHECK: {"$schema":"https://docs.oasis-open.org/sarif/sarif/v2.1.0/cos02/schemas/sarif-schema-2.1.0.json","runs":[{"artifacts":[{"length":2953,"location":{"index":0,"uri":"file:///usr/local/google/home/abrahamcd/projects/llvm-project/clang/test/Frontend/sarif-diagnostics.cpp"},"mimeType":"text/plain","roles":["resultFile"]}],"columnKind":"unicodeCodePoints","results":[{"locations":[{"physicalLocation":{"artifactLocation":{"index":0},"region":{"endColumn":1,"startColumn":1,"startLine":5}}}],"message":{"text":"'main' must return 'int'"},"ruleId":"3463","ruleIndex":0},{"locations":[{"physicalLocation":{"artifactLocation":{"index":0},"region":{"endColumn":11,"startColumn":11,"startLine":6}}}],"message":{"text":"use of undeclared identifier 'hello'"},"ruleId":"4601","ruleIndex":1},{"locations":[{"physicalLocation":{"artifactLocation":{"index":0},"region":{"endColumn":17,"startColumn":17,"startLine":8}}}],"message":{"text":"invalid digit 'a' in decimal constant"},"ruleId":"898","ruleIndex":2},{"locations":[{"physicalLocation":{"artifactLocation":{"index":0},"region":{"endColumn":5,"startColumn":5,"startLine":12}}}],"message":{"text":"misleading indentation; statement is not part of the previous 'if'"},"ruleId":"1806","ruleIndex":3},{"locations":[{"physicalLocation":{"artifactLocation":{"index":0},"region":{"endColumn":3,"startColumn":3,"startLine":10}}}],"message":{"text":"previous statement is here"},"ruleId":"1730","ruleIndex":4},{"locations":[{"physicalLocation":{"artifactLocation":{"index":0},"region":{"endColumn":10,"startColumn":10,"startLine":11}}}],"message":{"text":"unused variable 'Yes'"},"ruleId":"6536","ruleIndex":5},{"locations":[{"physicalLocation":{"artifactLocation":{"index":0},"region":{"endColumn":12,"startColumn":12,"startLine":14}}}],"message":{"text":"use of undeclared identifier 'hi'"},"ruleId":"4601","ruleIndex":6},{"locations":[{"physicalLocation":{"artifactLocation":{"index":0},"region":{"endColumn":1,"startColumn":1,"startLine":16}}}],"message":{"text":"extraneous closing brace ('}')"},"ruleId":"1399","ruleIndex":7}],"tool":{"driver":{"fullName":"","informationUri":"https://clang.llvm.org/docs/UsersManual.html","language":"en-US","name":"clang","rules":[{"fullDescription":{"text":""},"id":"3463","name":""},{"fullDescription":{"text":""},"id":"4601","name":""},{"fullDescription":{"text":""},"id":"898","name":""},{"fullDescription":{"text":""},"id":"1806","name":""},{"fullDescription":{"text":""},"id":"1730","name":""},{"fullDescription":{"text":""},"id":"6536","name":""},{"fullDescription":{"text":""},"id":"4601","name":""},{"fullDescription":{"text":""},"id":"1399","name":""}],"version":"16.0.0"}}}],"version":"2.1.0"}
+
----------------
abrahamcd wrote:
> In regards to testing this, my understanding is that we are moving away from the unit test GTest testing, correct? I included a FileCheck implementation as well, but I wasn't sure how effective just checking the whole SARIF object like this would be.
> In regards to testing this, my understanding is that we are moving away from the unit test GTest testing, correct?

Yep, that's correct.

> I included a FileCheck implementation as well, but I wasn't sure how effective just checking the whole SARIF object like this would be.

Isn't this the FileCheck test?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131632/new/

https://reviews.llvm.org/D131632



More information about the cfe-commits mailing list