[cfe-dev] [RFC] Emit SARIF Diagnostics via -fdiagnostics-format=sarif

Vaibhav Yenamandra (BLOOMBERG/ 919 3RD A) via cfe-dev cfe-dev at lists.llvm.org
Thu Mar 11 09:59:55 PST 2021

Hello Everyone,

Below is an RFC on extending clang's `-fdiagnostics-format` option to
let clang emit machine-readable JSON diagnostics. Feedback is highly appreciated!

# Why
Machine-consumable diagnostics are important for writing generic static
analysis wrappers and harnesses that want to interact with code bases through
clang. There are two options to consider for the diagnostic format:

1. Mimic `gcc-9 -fdiagnostics-format=json`, covered in the previous work section
2. Emit [SARIF][0] diagnostic information, a cross-language standardized format
   that is already supported in `clang/lib/StaticAnalyzer` (through `--analyzer-output=sarif`)

We propose (2) as it is a standardized format, which should make it easier for tools to
implement support for it.

## Previous Work

### `gcc-9 -fdiagnostics-format=json`
GCC [recently][1] [implemented][2] serializing diagnostics to JSON. This option
could be implemented as `-fdiagnostics-format=json-gcc` in clang to signal to
users that it is intended to interoperate with the corresponding gcc option.
The schema for this format may be inferred from [current gcc code][3].

While not a community standard, it can be expected to be reasonably stable, as the
[original patch][2] states that the flag emits machine-readable diagnostics.

## SARIF diagnostics in LLVM

[SARIF][0] (Static Analysis Results Interchange Format) is a standard format
for the output of static analysis tools.

Clang StaticAnalyzer already implements a SARIF diagnostic consumer in
[D53814][4]. This should allow us to add any extra fields that the
diagnostics output turns out to need.

### Mapping clang diagnostics to SARIF

This section assumes the typical compiler diagnostic, which looks like the
examples provided on the [expressive diagnostics page][5].

In SARIF, the attributes can be mapped to the [`results`][7] property as follows:
1. The file name where the diagnostic occurs maps to the [`physicalLocation`][8]
   property
2. The line/column of the caret marking the error can be stored in the [`region`][9]
   property, which can also encode the source range to which an error corresponds
3. The error message can be transferred to the [`message`][10] property
4. Each of the locations can store the rendered caret & snippet from clang using the
   [`snippet`][12] property for that region
5. Nested diagnostics (typically `note` level items) can be represented using the
   [`locationRelationship`][14] object
6. Fixit hints can be communicated through the [`fixes`][13] property
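
To make the mapping above concrete, here is a rough Python sketch that turns a simplified diagnostic into a SARIF `result`. It is only an illustration: the input dictionary shape (`file`, `line`, `column`, `end_column`, `snippet`, `notes`, `fixits`) and the helper names are hypothetical and do not mirror clang's internal diagnostic classes. The example diagnostic is made up, and the SARIF property names are taken from my reading of the 2.1.0 specification.

```python
import json


def to_location(diag):
    """Build a SARIF location (items 1, 2 and 4) from a hypothetical diagnostic dict."""
    region = {"startLine": diag["line"], "startColumn": diag["column"]}
    if "end_column" in diag:
        region["endColumn"] = diag["end_column"]  # end of the source range
    if "snippet" in diag:
        region["snippet"] = {"text": diag["snippet"]}
    return {
        "physicalLocation": {
            "artifactLocation": {"uri": diag["file"]},
            "region": region,
        }
    }


def to_fix(file, fixit):
    """Build a SARIF fix (item 6); a zero-length deletedRegion means insertion."""
    return {
        "description": {"text": "Apply fix-it hint"},
        "artifactChanges": [{
            "artifactLocation": {"uri": file},
            "replacements": [{
                "deletedRegion": {
                    "startLine": fixit["line"],
                    "startColumn": fixit["column"],
                    "endColumn": fixit["end_column"],
                },
                "insertedContent": {"text": fixit["text"]},
            }],
        }],
    }


def to_sarif_result(diag):
    """Map one diagnostic, its notes and its fix-its onto a SARIF result."""
    primary = to_location(diag)
    result = {
        "level": diag["level"],
        "message": {"text": diag["message"]},  # item 3
        "locations": [primary],
    }
    notes = diag.get("notes", [])
    if notes:
        # Item 5: notes become related locations that the primary
        # location points at through locationRelationship objects.
        primary["id"] = 0
        primary["relationships"] = []
        result["relatedLocations"] = []
    for index, note in enumerate(notes, start=1):
        related = to_location(note)
        related["id"] = index
        related["message"] = {"text": note["message"]}
        result["relatedLocations"].append(related)
        primary["relationships"].append({"target": index, "kinds": ["relevant"]})
    fixits = diag.get("fixits", [])
    if fixits:
        result["fixes"] = [to_fix(diag["file"], f) for f in fixits]
    return result


def to_sarif_log(diagnostics):
    """Wrap results in a minimal SARIF log with a single run."""
    return {
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "clang"}},
            "results": [to_sarif_result(d) for d in diagnostics],
        }],
    }


# Made-up diagnostic for the source line "  return foo(x;"
example = {
    "file": "t.c", "line": 3, "column": 15, "level": "error",
    "message": "expected ')'", "snippet": "  return foo(x;",
    "notes": [{"file": "t.c", "line": 3, "column": 13,
               "message": "to match this '('"}],
    "fixits": [{"line": 3, "column": 15, "end_column": 15, "text": ")"}],
}
print(json.dumps(to_sarif_log([example]), indent=2))
```

Modelling notes as `relatedLocations` linked through `relationships` is one possible reading of item 5; the actual implementation may well choose a different layout.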

## Interface Changes

We propose the following interface changes:

- Input: Extend the `-fdiagnostics-format` flag to recognize: `-fdiagnostics-format=sarif`
- Output: Clang will emit SARIF formatted diagnostics when `-fdiagnostics-format=sarif` is provided.
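
As a rough idea of what a wrapper could do with that output, the Python sketch below walks a SARIF log and flattens each result into a `(file, line, column, level, message)` tuple. The function name `iter_diagnostics` and the sample log are hypothetical. How the SARIF text is obtained from clang (which stream, one log per invocation or per translation unit) is not settled by this RFC, so the sketch simply takes a string.

```python
import json


def iter_diagnostics(sarif_text):
    """Yield (uri, line, column, level, message) for every result location."""
    log = json.loads(sarif_text)
    for run in log.get("runs", []):
        for result in run.get("results", []):
            message = result.get("message", {}).get("text", "")
            # Fall back to "warning" when a result carries no explicit level.
            level = result.get("level", "warning")
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "<unknown>")
                region = physical.get("region", {})
                yield (uri, region.get("startLine"), region.get("startColumn"),
                       level, message)


# Hand-written sample log, not real clang output.
sample = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "clang"}},
        "results": [{
            "level": "error",
            "message": {"text": "expected ')'"},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": "t.c"},
                    "region": {"startLine": 3, "startColumn": 15},
                }
            }],
        }],
    }],
})

for uri, line, column, level, message in iter_diagnostics(sample):
    print(f"{uri}:{line}:{column}: {level}: {message}")
```

Because the consumer relies only on generic SARIF properties, the same loop should in principle also work on logs produced by other SARIF-emitting tools.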

## Diagnostic Examples

Various examples are available in this GitHub gist (which also renders this message in markdown): https://gist.github.com/envp/3a5fdd33115b91c391c22e5e8a5210f4#diagnostic-examples

[0]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html
[1]: https://developers.redhat.com/blog/2019/03/08/usability-improvements-in-gcc-9
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=478dd60ddcf17773ebd1af367c9dcaee2401f797
[3]: https://github.com/gcc-mirror/gcc/blob/master/gcc/diagnostic-format-json.cc
[4]: https://reviews.llvm.org/D53814
[5]: https://clang.llvm.org/diagnostics.html
[6]: https://github.com/microsoft/sarif-tutorials/blob/main/docs/2-Basics.md#results
[7]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012463
[8]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012634
[9]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012641
[10]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012655
[11]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012632
[12]: https://docs.oasis-open.org/sarif/sarif/v2.0/csprd02/sarif-v2.0-csprd02.html#_Toc10127889
[13]: https://docs.oasis-open.org/sarif/sarif/v2.0/csprd02/sarif-v2.0-csprd02.html#_Toc10128072
[14]: https://docs.oasis-open.org/sarif/sarif/v2.0/csprd02/sarif-v2.0-csprd02.html#_Toc10127919
