[PATCH] D109701: [clang] Emit SARIF Diagnostics: Create `clang::SarifDocumentWriter` interface

Vaibhav Yenamandra via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Nov 29 09:56:53 PST 2021


vaibhav.y added inline comments.


================
Comment at: clang/include/clang/Basic/Sarif.h:80-82
+  static SarifArtifactLocation create(StringRef URI) {
+    return SarifArtifactLocation{URI};
+  }
----------------
aaron.ballman wrote:
> One thing that's worth calling out is that `StringRef` is non-owning which means that the argument passed to create the `SarifArtifactLocation` has to outlive the returned object to avoid dangling references. This makes the class a bit more dangerous to use because `Twine` or automatic `std::string` objects may cause lifetime concerns.
> 
> Should these classes be storing a `std::string` so that the memory is owned by SARIF?
Good point! Will change it to `std::string` to start with.

Some fields such as MimeType, RuleID would probably do better with `SmallString`, I haven't been able to find any good measurements on how the lengths of those two are distributed, can it be made into `SmallString<N>` in a future PR?


================
Comment at: clang/include/clang/Basic/Sarif.h:116
+  static SarifArtifact create(const SarifArtifactLocation &Loc) {
+    return SarifArtifactLocation{Loc};
+  }
----------------
aaron.ballman wrote:
> `Loc` is already a `SarifArtifactLocation`, did you mean `SarifArtifact` by any chance?
> 
> (Note, this suggests to me we should mark the ctor's here as `explicit`.)
Agree, I've marked all constructors taking a single parameter as explicit.


================
Comment at: clang/include/clang/Basic/Sarif.h:243-245
+  // While this cannot be negative, since this type needs to be serialized
+  // to JSON, it needs to be `int64_t`. The best we can do is assert that
+  // a negative value isn't used to create it
----------------
aaron.ballman wrote:
> This comment looks incorrect to me.
Ack, fixed that as well. the right rationale for `uint32_t` is that it is the largest non-negative type that can be safely promoted to `int64_t` (which is what LLVM's json encoder supports)


================
Comment at: clang/include/clang/Basic/SourceLocation.h:441-456
+    bool operator()(const FullSourceLoc &lhs, const FullSourceLoc &rhs) const {
       return lhs.isBeforeInTranslationUnitThan(rhs);
     }
   };
 
   /// Prints information about this FullSourceLoc to stderr.
   ///
----------------
aaron.ballman wrote:
> Looks like unrelated formatting changes; feel free to submit these as an NFC change if you'd like, but the changes should be reverted from this patch for clarity.
Ack, these are definitely a result of a bad rebase. 


================
Comment at: clang/lib/Basic/Sarif.cpp:199
+    const SarifArtifactLocation &Location =
+        SarifArtifactLocation::create(FileURI).setIndex(Idx);
+    const SarifArtifact &Artifact = SarifArtifact::create(Location)
----------------
aaron.ballman wrote:
> This seems like it'll be a use-after-free because the local `std::string` will be destroyed before the lifetime of the `SarifArtifactLocation` ends.
Will run it through a pass of asan & msan, is the best way to add: `-fsanitize=memory -fsanitize=address` to the test CMakeLists.txt & run them?

I've changed all strings to `std::string`, so this one should no longer be a problem but I wonder if there's any others I have missed as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109701/new/

https://reviews.llvm.org/D109701



More information about the cfe-commits mailing list