[PATCH] D109701: [clang] Emit SARIF Diagnostics: Create `clang::SarifDocumentWriter` interface

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Sep 16 07:11:24 PDT 2021


aaron.ballman added inline comments.


================
Comment at: clang/include/clang/Basic/Sarif.h:74
+
+  llvm::Optional<uint32_t> Index;
+  StringRef URI;
----------------
You have a `using namespace llvm;` at the top of the file, so all these `llvm::` nested name specifiers can be removed.


================
Comment at: clang/include/clang/Basic/Sarif.h:77
+
+  SarifArtifactLocation(const StringRef &URI) : Index(), URI(URI) {}
+
----------------
`StringRef` is a non-owning reference object anyway, so there's really no gain from passing a const ref to it -- we typically pass it by value. Same suggestion applies elsewhere in the patch.

Also, no need to explicitly init things with default constructors that will be run, like `Index`.


================
Comment at: clang/include/clang/Basic/Sarif.h:85
+  SarifArtifactLocation &setIndex(uint32_t Idx) {
+    this->Index = Idx;
+    return *this;
----------------
Local style is to not use `this->` unless required, so I'd recommend removing all of the uses (and potentially renaming some parameters so there's not a shadowing issue).


================
Comment at: clang/include/clang/Basic/Sarif.h:95
+//
+/// Since every in clang artifact MUST have a location (there being no nested
+/// artifacts), the creation method \ref SarifArtifact::create requires a
----------------
How do we expect to handle artifact locations that don't correspond directly to a file? For example, the user can specify macros on the command line and those macros could have a diagnostic result associated with them. Can we handle that sort of scenario?


================
Comment at: clang/include/clang/Basic/Sarif.h:97
+/// artifacts), the creation method \ref SarifArtifact::create requires a
+/// \ref SarifArtifactLocation object
+///
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:109
+  SarifArtifactLocation Location;
+  SmallVector<StringRef> Roles;
+
----------------
What size would you like this `SmallVector` to have?


================
Comment at: clang/include/clang/Basic/Sarif.h:112
+  SarifArtifact(const SarifArtifactLocation &Loc)
+      : Offset(), Length(), MimeType(), Location(Loc), Roles() {}
+
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:129
+
+  SarifArtifact &setRoles(const std::initializer_list<StringRef> &Roles) {
+    this->Roles.assign(Roles);
----------------
This is another lightweight nonowning wrapper type that we don't usually pass as a const ref.


================
Comment at: clang/include/clang/Basic/Sarif.h:134
+
+  SarifArtifact &setMimeType(const StringRef &MimeType) {
+    this->MimeType = MimeType;
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:250
+
+  SarifResult() = default;
+
----------------
A default constructed `SarifResult` will have an uninitialized `RuleIdx` -- are you okay with that?


================
Comment at: clang/include/clang/Basic/Sarif.h:270-274
+  SarifResult &setLocations(const ArrayRef<FullSourceRange> &DiagLocs) {
+    this->Locations = DiagLocs;
+    return *this;
+  }
+  SarifResult &setThreadFlows(const ArrayRef<ThreadFlow> &ThreadFlows) {
----------------
Also a nonowning reference type that's meant to be passed by value.


================
Comment at: clang/include/clang/Basic/Sarif.h:285
+///    must ensure that \ref SarifDocumentWriter::createRun is is called before
+///    anyother methods.
+/// 2. If SarifDocumentWriter::endRun is called, callers MUST call
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:297-298
+  /// \internal
+  /// Return a pointer to the current tool. If no run exists, this will
+  /// crash.
+  json::Object *getCurrentTool();
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:302
+  /// \internal
+  /// Checks if there is a run associated with this document
+  ///
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:309
+  /// Reset portions of the internal state so that the document is ready to
+  /// recieve data for a new run
+  void reset();
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:315-316
+  ///
+  /// \note If a run does not exist in the SARIF document, calling this will
+  /// trigger undefined behaviour
+  json::Object *currentRun();
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:322-323
+  ///
+  /// \note If a run does not exist in the SARIF document, calling this will
+  /// trigger undefined behaviour
+  json::Object createCodeFlow(const ArrayRef<ThreadFlow> &ThreadFlows);
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:324
+  /// trigger undefined behaviour
+  json::Object createCodeFlow(const ArrayRef<ThreadFlow> &ThreadFlows);
+
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:326
+
+  /// Add the given threadflows to the ones this SARIF document knows about
+  json::Array createThreadFlows(const ArrayRef<ThreadFlow> &ThreadFlows);
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:327
+  /// Add the given threadflows to the ones this SARIF document knows about
+  json::Array createThreadFlows(const ArrayRef<ThreadFlow> &ThreadFlows);
+
----------------



================
Comment at: clang/include/clang/Basic/Sarif.h:330
+  /// Add the given \ref FullSourceRange to the SARIF document as a physical
+  /// location, with it's corresponding artifact
+  json::Object createPhysicalLocation(const FullSourceRange &R);
----------------
Same suggestions apply elsewhere in the patch.


================
Comment at: clang/include/clang/Basic/Sarif.h:388-389
+private:
+  /// Langauge options to use for the current SARIF document
+  const LangOptions *LangOpts;
+
----------------
I think this should be a value type rather than a possibly null pointer type -- this way, the document can always rely on there being valid language options to check, and if the user provides no custom language options, the default `LangOptions` suffice. Alternatively, it seems reasonable to expect the user to have to pass in a valid language options object in order to create a SARIF document. WDYT?


================
Comment at: clang/include/clang/Basic/Sarif.h:69
+/// Reference:
+/// 1. <a href="https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/sarif-v2.1.0-os.html#_Toc34317427">artifactLocation object</a>
+/// 2. \ref SarifArtifact
----------------
vaibhav.y wrote:
> I'm not sure how to deal with overlength links in docs directly, turning clang format off & on on comments also seems counter-productive. Would it be okay to add an alias in the doxyfile for the root page of SARIF docs?
> 
> E.g.:
> 
> ```
> ALIASES = sarifDocs="https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/sarif-v2.1.0-os.html"
> ```
I think it's fine to ignore the clang-format warnings here (without adding comment markers). It's easier for people reading the links to see the full URL here than adding an alias in the doxyfile, IMO.


================
Comment at: clang/include/clang/Basic/SourceLocation.h:473-474
+
+  const FullSourceLoc &getBegin() const { return B; }
+  const FullSourceLoc &getEnd() const { return E; }
+
----------------
More consistent with the other range APIs.


================
Comment at: clang/lib/Basic/Sarif.cpp:198-204
+    const SarifArtifactLocation &location =
+        SarifArtifactLocation::create(FileURI).setIndex(Idx);
+    const SarifArtifact &artifact = SarifArtifact::create(location)
+                                        .setRoles({"resultFile"})
+                                        .setLength(FE->getSize())
+                                        .setMimeType("text/plain");
+    auto statusIter = CurrentArtifacts.insert({FileURI, artifact});
----------------
Same suggestion applies elsewhere in the patch regarding naming conventions.


================
Comment at: clang/lib/Basic/Sarif.cpp:207-209
+    if (statusIter.second) {
+      I = statusIter.first;
+    }
----------------
Our usual coding style elides these too. Btw, you can find the coding style document at: https://llvm.org/docs/CodingStandards.html


================
Comment at: clang/lib/Basic/Sarif.cpp:219-222
+json::Object *SarifDocumentWriter::getCurrentTool() {
+  assert(hasRun() && "Need to call createRun() before using getcurrentTool!");
+  return Runs.back().getAsObject()->get("tool")->getAsObject();
+}
----------------
Should this return a reference rather than a pointer?


================
Comment at: clang/lib/Basic/Sarif.cpp:230-232
+  if (!hasRun()) {
+    return;
+  }
----------------
Is there a reason why we don't want to assert instead?


================
Comment at: clang/lib/Basic/Sarif.cpp:236
+  json::Object &Tool = *getCurrentTool();
+  json::Array Rules{};
+  for (const SarifRule &R : CurrentRules) {
----------------



================
Comment at: clang/lib/Basic/Sarif.cpp:241-243
+    if (!R.HelpURI.empty()) {
+      theRule["helpUri"] = R.HelpURI;
+    }
----------------
Same suggestion applies elsewhere in the patch.


================
Comment at: clang/lib/Basic/Sarif.cpp:278
+json::Array SarifDocumentWriter::createThreadFlows(
+    const ArrayRef<ThreadFlow> &ThreadFlows) {
+  json::Object Ret{{"locations", json::Array{}}};
----------------



================
Comment at: clang/lib/Basic/Sarif.cpp:280
+  json::Object Ret{{"locations", json::Array{}}};
+  json::Array Locs{};
+  for (const auto &ThreadFlow : ThreadFlows) {
----------------



================
Comment at: clang/lib/Basic/Sarif.cpp:292
+json::Object
+SarifDocumentWriter::createCodeFlow(const ArrayRef<ThreadFlow> &ThreadFlows) {
+  return json::Object{{"threadFlows", createThreadFlows(ThreadFlows)}};
----------------



================
Comment at: clang/lib/Basic/Sarif.cpp:299
+  // Clear resources associated with a previous run
+  endRun();
+
----------------
Is there a reason we don't want to assert that the caller has already ended a run before they created a new one?


================
Comment at: clang/lib/Basic/Sarif.cpp:314
+
+bool SarifDocumentWriter::hasRun() const { return Runs.size() != 0; }
+
----------------



================
Comment at: clang/lib/Basic/Sarif.cpp:316
+
+json::Object *SarifDocumentWriter::currentRun() {
+  assert(hasRun() && "SARIF Document has no runs, create a run first!");
----------------
Should this return a reference as well?


================
Comment at: clang/lib/Basic/Sarif.cpp:348
+                   {"ruleId", CurrentRules[RuleIdx].RuleId}};
+  if (Result.Locations.size() != 0) {
+    json::Array Locs{};
----------------



================
Comment at: clang/lib/Basic/Sarif.cpp:349
+  if (Result.Locations.size() != 0) {
+    json::Array Locs{};
+    for (auto &Range : Result.Locations) {
----------------



================
Comment at: clang/lib/Basic/Sarif.cpp:355
+  }
+  if (Result.ThreadFlows.size() != 0) {
+    Ret["codeFlows"] = json::Array{createCodeFlow(Result.ThreadFlows)};
----------------



================
Comment at: clang/lib/Basic/Sarif.cpp:371-373
+  if (Runs.size() > 0) {
+    doc["runs"] = json::Array(Runs);
+  }
----------------



================
Comment at: clang/lib/Basic/SourceLocation.cpp:281
+  OS << '<';
+  auto PrintedLoc = PrintDifference(OS, B.getManager(), B, {});
+  if (B != E) {
----------------
Please spell out the type of `PrintedLoc`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109701/new/

https://reviews.llvm.org/D109701



More information about the cfe-commits mailing list