[cfe-dev] [RFC] Emit SARIF Diagnostics via -fdiagnostics-format=sarif

Vaibhav Yenamandra (BLOOMBERG/ 919 3RD A) via cfe-dev cfe-dev at lists.llvm.org
Thu Mar 18 09:06:03 PDT 2021


Ping.

Are there any other objections or comments on this RFC?

Original message rendered here: https://gist.github.com/envp/3a5fdd33115b91c391c22e5e8a5210f4

From: aaron at aaronballman.com At: 03/11/21 13:23:02
To: Vaibhav Yenamandra (BLOOMBERG/ 919 3RD A)
Cc: Daniel Ruoso (BLOOMBERG/ 919 3RD A), Daniel Beer (BLOOMBERG/ 919 3RD A), cfe-dev at lists.llvm.org
Subject: Re: [cfe-dev] [RFC] Emit SARIF Diagnostics via -fdiagnostics-format=sarif

On Thu, Mar 11, 2021 at 1:00 PM Vaibhav Yenamandra (BLOOMBERG/ 919 3RD
A) via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>
> Hello Everyone,
>
> Below is an RFC on extending clang's `-fdiagnostics-format` option to let
> clang emit machine-readable JSON diagnostics. Feedback is highly
> appreciated!
>
> # Why
> Machine-consumable diagnostics are important for writing generic static
> analysis wrappers and harnesses that want to interact with code bases through
> clang. There are two options to consider for the diagnostic format to use in
> clang:
>
> 1. Mimic `gcc-9 -fdiagnostics-format=json`, covered in the Previous Work
> section
> 2. Emit [SARIF][0] diagnostic information, a cross-language standardized
> format that is already supported in `clang/lib/StaticAnalyzer` (through
> `--analyzer-output=sarif`)
>
> We propose (2) as it is a standardized format, which should make it easier
> for tools to implement support for it.

I'd support option #2 -- SARIF has a lot of nice tooling support
that's forming in the industry (such as
https://docs.github.com/en/github/finding-security-vulnerabilities-and-errors-in-your-code/uploading-a-sarif-file-to-github).
I'm not super excited about #1 given the existence of #2.

> ## Previous Work
>
> ### `gcc-9 -fdiagnostics-format=json`
> GCC [recently][1] [implemented][2] serializing diagnostics to JSON. This
> option could be implemented as `-fdiagnostics-format=json-gcc` in clang to
> signal to users its intended interoperability with the corresponding gcc
> option. The schema for this format may be inferred from [current gcc code][3].
>
> While not a community standard, it can be expected to be reasonably stable, as
> the [original patch][2] states the flag emits machine-readable diagnostics.
>
> ## SARIF diagnostics in LLVM
>
> [SARIF][0] (Static Analysis Results Interchange Format) is a standard format
> for the output of static analysis tools.
>
> Clang StaticAnalyzer already implements a SARIF diagnostic consumer in
> [D53814][4]; this should allow us to add any extra fields that turn out to be
> necessary to the diagnostics output.
>
> ### Mapping clang diagnostics to SARIF
>
> This section assumes a typical compiler diagnostic of the kind shown on the
> [expressive diagnostics page][5].
>
> In SARIF, the attributes can be mapped to the [`results`][7] property as
> follows:
> 1. The file name where the diagnostic occurs is stored in the
> [`physicalLocation`][8] property
> 2. The line/column of the caret marking the error can be stored in the
> [`region`][9] property; this can also encode the source range to which an
> error corresponds
> 3. The error message can be transferred to the [`message`][10] property
> 4. Each of the locations can store the rendered caret and snippet from clang
> using the [`snippet`][12] property for that region
> 5. Nested diagnostics (typically `note` level items) can be represented using
> the [`locationRelationship`][14] object
> 6. Fix-it hints can be communicated through the [`fixes`][13] property

This looks sensible to me.
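
As a rough illustration of the six-point mapping quoted above, here is a
minimal Python sketch. It is not clang code and not part of the proposal: the
input dict shape, the function names `to_location` and `to_sarif_result`, and
the example values are all hypothetical. Only the SARIF property names
(`physicalLocation`, `region`, `message`, `snippet`, `relatedLocations`,
`relationships`, `fixes`) come from the SARIF specification, and how clang
would actually populate them is an assumption.

```python
import json


def to_location(diag):
    """Build a SARIF location object from a diagnostic's file/line/column."""
    region = {"startLine": diag["line"], "startColumn": diag["column"]}
    if "snippet" in diag:
        # Point 4: the rendered source snippet is attached to the region.
        region["snippet"] = {"text": diag["snippet"]}
    return {
        "physicalLocation": {
            "artifactLocation": {"uri": diag["file"]},  # point 1
            "region": region,                           # point 2
        }
    }


def to_sarif_result(diag):
    """Map one diagnostic (hypothetical dict shape) to a SARIF result object."""
    primary = to_location(diag)
    result = {
        "level": diag["level"],
        "message": {"text": diag["message"]},  # point 3
        "locations": [primary],
    }
    # Point 5 (assumed encoding): each note becomes a related location, and
    # the primary location points at it via a locationRelationship object.
    for note_id, note in enumerate(diag.get("notes", [])):
        related = to_location(note)
        related["id"] = note_id
        related["message"] = {"text": note["message"]}
        result.setdefault("relatedLocations", []).append(related)
        primary.setdefault("relationships", []).append({"target": note_id})
    # Point 6: each fix-it hint becomes a fix object with one replacement.
    for fixit in diag.get("fixits", []):
        result.setdefault("fixes", []).append({
            "description": {"text": fixit["description"]},
            "artifactChanges": [{
                "artifactLocation": {"uri": diag["file"]},
                "replacements": [{
                    "deletedRegion": fixit["region"],
                    "insertedContent": {"text": fixit["text"]},
                }],
            }],
        })
    return result


# Made-up diagnostic, used only to exercise the sketch.
example = {
    "file": "example.c",
    "line": 2,
    "column": 10,
    "level": "error",
    "message": "example error message",
    "snippet": "  return x",
    "notes": [
        {"file": "example.h", "line": 1, "column": 5, "message": "example note"},
    ],
    "fixits": [
        {
            "description": "example fix",
            "region": {"startLine": 2, "startColumn": 10, "endColumn": 11},
            "text": "y",
        },
    ],
}

sarif_result = to_sarif_result(example)
print(json.dumps(sarif_result, indent=2))
```

A real emitter would also wrap such results in the enclosing `runs` array of a
SARIF log, which is left out here.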

~Aaron

> ## Interface Changes
>
> We propose the following interface changes:
>
> - Input: Extend the `-fdiagnostics-format` flag to recognize
> `-fdiagnostics-format=sarif`
> - Output: Clang will emit SARIF-formatted diagnostics when
> `-fdiagnostics-format=sarif` is provided.
>
> ## Diagnostic Examples
>
> Various diagnostic examples are available in this GitHub gist (which also
> renders this message in markdown):
> https://gist.github.com/envp/3a5fdd33115b91c391c22e5e8a5210f4#diagnostic-examples
>
>
> [0]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html
> [1]: https://developers.redhat.com/blog/2019/03/08/usability-improvements-in-gcc-9
> [2]: https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=478dd60ddcf17773ebd1af367c9dcaee2401f797
> [3]: https://github.com/gcc-mirror/gcc/blob/master/gcc/diagnostic-format-json.cc
> [4]: https://reviews.llvm.org/D53814
> [5]: https://clang.llvm.org/diagnostics.html
> [6]: https://github.com/microsoft/sarif-tutorials/blob/main/docs/2-Basics.md#results
> [7]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012463
> [8]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012634
> [9]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012641
> [10]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012655
> [11]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012632
> [12]: https://docs.oasis-open.org/sarif/sarif/v2.0/csprd02/sarif-v2.0-csprd02.html#_Toc10127889
> [13]: https://docs.oasis-open.org/sarif/sarif/v2.0/csprd02/sarif-v2.0-csprd02.html#_Toc10128072
> [14]: https://docs.oasis-open.org/sarif/sarif/v2.0/csprd02/sarif-v2.0-csprd02.html#_Toc10127919
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

