[PATCH] D109701: [clang] Emit SARIF Diagnostics: Create `clang::SarifDocumentWriter` interface
Aaron Ballman via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Sep 16 07:11:24 PDT 2021
aaron.ballman added inline comments.
================
Comment at: clang/include/clang/Basic/Sarif.h:74
+
+ llvm::Optional<uint32_t> Index;
+ StringRef URI;
----------------
You have a `using namespace llvm;` at the top of the file, so all these `llvm::` nested name specifiers can be removed.
================
Comment at: clang/include/clang/Basic/Sarif.h:77
+
+ SarifArtifactLocation(const StringRef &URI) : Index(), URI(URI) {}
+
----------------
`StringRef` is a non-owning reference object anyway, so there's really no gain from passing a const ref to it -- we typically pass it by value. Same suggestion applies elsewhere in the patch.
Also, no need to explicitly init things with default constructors that will be run, like `Index`.
================
Comment at: clang/include/clang/Basic/Sarif.h:85
+ SarifArtifactLocation &setIndex(uint32_t Idx) {
+ this->Index = Idx;
+ return *this;
----------------
Local style is to not use `this->` unless required, so I'd recommend removing all of the uses (and potentially renaming some parameters so there's not a shadowing issue).
================
Comment at: clang/include/clang/Basic/Sarif.h:95
+//
+/// Since every in clang artifact MUST have a location (there being no nested
+/// artifacts), the creation method \ref SarifArtifact::create requires a
----------------
How do we expect to handle artifact locations that don't correspond directly to a file? For example, the user can specify macros on the command line and those macros could have a diagnostic result associated with them. Can we handle that sort of scenario?
================
Comment at: clang/include/clang/Basic/Sarif.h:97
+/// artifacts), the creation method \ref SarifArtifact::create requires a
+/// \ref SarifArtifactLocation object
+///
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:109
+ SarifArtifactLocation Location;
+ SmallVector<StringRef> Roles;
+
----------------
What size would you like this `SmallVector` to have?
================
Comment at: clang/include/clang/Basic/Sarif.h:112
+ SarifArtifact(const SarifArtifactLocation &Loc)
+ : Offset(), Length(), MimeType(), Location(Loc), Roles() {}
+
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:129
+
+ SarifArtifact &setRoles(const std::initializer_list<StringRef> &Roles) {
+ this->Roles.assign(Roles);
----------------
This is another lightweight nonowning wrapper type that we don't usually pass as a const ref.
================
Comment at: clang/include/clang/Basic/Sarif.h:134
+
+ SarifArtifact &setMimeType(const StringRef &MimeType) {
+ this->MimeType = MimeType;
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:250
+
+ SarifResult() = default;
+
----------------
A default constructed `SarifResult` will have an uninitialized `RuleIdx` -- are you okay with that?
================
Comment at: clang/include/clang/Basic/Sarif.h:270-274
+ SarifResult &setLocations(const ArrayRef<FullSourceRange> &DiagLocs) {
+ this->Locations = DiagLocs;
+ return *this;
+ }
+ SarifResult &setThreadFlows(const ArrayRef<ThreadFlow> &ThreadFlows) {
----------------
Also a nonowning reference type that's meant to be passed by value.
================
Comment at: clang/include/clang/Basic/Sarif.h:285
+/// must ensure that \ref SarifDocumentWriter::createRun is is called before
+/// anyother methods.
+/// 2. If SarifDocumentWriter::endRun is called, callers MUST call
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:297-298
+ /// \internal
+ /// Return a pointer to the current tool. If no run exists, this will
+ /// crash.
+ json::Object *getCurrentTool();
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:302
+ /// \internal
+ /// Checks if there is a run associated with this document
+ ///
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:309
+ /// Reset portions of the internal state so that the document is ready to
+ /// recieve data for a new run
+ void reset();
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:315-316
+ ///
+ /// \note If a run does not exist in the SARIF document, calling this will
+ /// trigger undefined behaviour
+ json::Object *currentRun();
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:322-323
+ ///
+ /// \note If a run does not exist in the SARIF document, calling this will
+ /// trigger undefined behaviour
+ json::Object createCodeFlow(const ArrayRef<ThreadFlow> &ThreadFlows);
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:324
+ /// trigger undefined behaviour
+ json::Object createCodeFlow(const ArrayRef<ThreadFlow> &ThreadFlows);
+
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:326
+
+ /// Add the given threadflows to the ones this SARIF document knows about
+ json::Array createThreadFlows(const ArrayRef<ThreadFlow> &ThreadFlows);
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:327
+ /// Add the given threadflows to the ones this SARIF document knows about
+ json::Array createThreadFlows(const ArrayRef<ThreadFlow> &ThreadFlows);
+
----------------
================
Comment at: clang/include/clang/Basic/Sarif.h:330
+ /// Add the given \ref FullSourceRange to the SARIF document as a physical
+ /// location, with it's corresponding artifact
+ json::Object createPhysicalLocation(const FullSourceRange &R);
----------------
Same suggestions apply elsewhere in the patch.
================
Comment at: clang/include/clang/Basic/Sarif.h:388-389
+private:
+ /// Langauge options to use for the current SARIF document
+ const LangOptions *LangOpts;
+
----------------
I think this should be a value type rather than a possibly null pointer type -- this way, the document can always rely on there being valid language options to check, and if the user provides no custom language options, the default `LangOptions` suffice. Alternatively, it seems reasonable to expect the user to have to pass in a valid language options object in order to create a SARIF document. WDYT?
================
Comment at: clang/include/clang/Basic/Sarif.h:69
+/// Reference:
+/// 1. <a href="https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/sarif-v2.1.0-os.html#_Toc34317427">artifactLocation object</a>
+/// 2. \ref SarifArtifact
----------------
vaibhav.y wrote:
> I'm not sure how to deal with overlength links in docs directly, turning clang format off & on on comments also seems counter-productive. Would it be okay to add an alias in the doxyfile for the root page of SARIF docs?
>
> E.g.:
>
> ```
> ALIASES = sarifDocs="https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/sarif-v2.1.0-os.html"
> ```
I think it's fine to ignore the clang-format warnings here (without adding comment markers). It's easier for people reading the links to see the full URL here than adding an alias in the doxyfile, IMO.
================
Comment at: clang/include/clang/Basic/SourceLocation.h:473-474
+
+ const FullSourceLoc &getBegin() const { return B; }
+ const FullSourceLoc &getEnd() const { return E; }
+
----------------
More consistent with the other range APIs.
================
Comment at: clang/lib/Basic/Sarif.cpp:198-204
+ const SarifArtifactLocation &location =
+ SarifArtifactLocation::create(FileURI).setIndex(Idx);
+ const SarifArtifact &artifact = SarifArtifact::create(location)
+ .setRoles({"resultFile"})
+ .setLength(FE->getSize())
+ .setMimeType("text/plain");
+ auto statusIter = CurrentArtifacts.insert({FileURI, artifact});
----------------
Same suggestion applies elsewhere in the patch regarding naming conventions.
================
Comment at: clang/lib/Basic/Sarif.cpp:207-209
+ if (statusIter.second) {
+ I = statusIter.first;
+ }
----------------
Our usual coding style elides these too. Btw, you can find the coding style document at: https://llvm.org/docs/CodingStandards.html
================
Comment at: clang/lib/Basic/Sarif.cpp:219-222
+json::Object *SarifDocumentWriter::getCurrentTool() {
+ assert(hasRun() && "Need to call createRun() before using getcurrentTool!");
+ return Runs.back().getAsObject()->get("tool")->getAsObject();
+}
----------------
Should this return a reference rather than a pointer?
================
Comment at: clang/lib/Basic/Sarif.cpp:230-232
+ if (!hasRun()) {
+ return;
+ }
----------------
Is there a reason why we don't want to assert instead?
================
Comment at: clang/lib/Basic/Sarif.cpp:236
+ json::Object &Tool = *getCurrentTool();
+ json::Array Rules{};
+ for (const SarifRule &R : CurrentRules) {
----------------
================
Comment at: clang/lib/Basic/Sarif.cpp:241-243
+ if (!R.HelpURI.empty()) {
+ theRule["helpUri"] = R.HelpURI;
+ }
----------------
Same suggestion applies elsewhere in the patch.
================
Comment at: clang/lib/Basic/Sarif.cpp:278
+json::Array SarifDocumentWriter::createThreadFlows(
+ const ArrayRef<ThreadFlow> &ThreadFlows) {
+ json::Object Ret{{"locations", json::Array{}}};
----------------
================
Comment at: clang/lib/Basic/Sarif.cpp:280
+ json::Object Ret{{"locations", json::Array{}}};
+ json::Array Locs{};
+ for (const auto &ThreadFlow : ThreadFlows) {
----------------
================
Comment at: clang/lib/Basic/Sarif.cpp:292
+json::Object
+SarifDocumentWriter::createCodeFlow(const ArrayRef<ThreadFlow> &ThreadFlows) {
+ return json::Object{{"threadFlows", createThreadFlows(ThreadFlows)}};
----------------
================
Comment at: clang/lib/Basic/Sarif.cpp:299
+ // Clear resources associated with a previous run
+ endRun();
+
----------------
Is there a reason we don't want to assert that the caller has already ended a run before they created a new one?
================
Comment at: clang/lib/Basic/Sarif.cpp:314
+
+bool SarifDocumentWriter::hasRun() const { return Runs.size() != 0; }
+
----------------
================
Comment at: clang/lib/Basic/Sarif.cpp:316
+
+json::Object *SarifDocumentWriter::currentRun() {
+ assert(hasRun() && "SARIF Document has no runs, create a run first!");
----------------
Should this return a reference as well?
================
Comment at: clang/lib/Basic/Sarif.cpp:348
+ {"ruleId", CurrentRules[RuleIdx].RuleId}};
+ if (Result.Locations.size() != 0) {
+ json::Array Locs{};
----------------
================
Comment at: clang/lib/Basic/Sarif.cpp:349
+ if (Result.Locations.size() != 0) {
+ json::Array Locs{};
+ for (auto &Range : Result.Locations) {
----------------
================
Comment at: clang/lib/Basic/Sarif.cpp:355
+ }
+ if (Result.ThreadFlows.size() != 0) {
+ Ret["codeFlows"] = json::Array{createCodeFlow(Result.ThreadFlows)};
----------------
================
Comment at: clang/lib/Basic/Sarif.cpp:371-373
+ if (Runs.size() > 0) {
+ doc["runs"] = json::Array(Runs);
+ }
----------------
================
Comment at: clang/lib/Basic/SourceLocation.cpp:281
+ OS << '<';
+ auto PrintedLoc = PrintDifference(OS, B.getManager(), B, {});
+ if (B != E) {
----------------
Please spell out the type of `PrintedLoc`.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D109701/new/
https://reviews.llvm.org/D109701
More information about the cfe-commits
mailing list