[cfe-dev] [analyzer] Project to output SARIF

Tue Sep 18 14:55:13 PDT 2018

Gábor:

>
> I was wondering what constraints of the plist format make you want
> to output SARIF natively. I am not opposed to adding another output
> format, but if you miss something from the plist during conversion,
> chances are good that other consumers of the plist would find that
> information useful. As far as I understand, the current plist format
> is not frozen in any way, so extending it is also an option
> (regardless of whether a new output format is added or not).
That's a good point. My understanding (I concede that my knowledge of
the details is second-hand) is that the plist format is quite tightly
tied to the way paths are visualized within Xcode. Other viewers
like to show path visualizations in other ways, but supporting those
did not seem to be possible in plist. In particular, there are good
reasons to have different levels of "importance" associated with points
and edges. Also, there didn't seem to be anything in plist that would
allow you to express more than one thread. Finally, there is a lot of
metadata about the analysis that can't be expressed in plist.
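
To make the importance and multi-thread points concrete, here is a
minimal sketch of what such a result could look like. It is loosely
modeled on SARIF's codeFlows/threadFlows structure; the exact property
names are assumptions to be checked against the specification, and the
rule id, file name, and messages are hypothetical.

```python
import json


def make_thread_flow(thread_id, steps):
    """Build one thread's path from (uri, line, text, importance) tuples."""
    return {
        "id": thread_id,
        "locations": [
            {
                # Assumed importance levels: "essential", "important",
                # "unimportant". A viewer could hide the less important ones.
                "importance": importance,
                "location": {
                    "physicalLocation": {
                        "artifactLocation": {"uri": uri},
                        "region": {"startLine": line},
                    },
                    "message": {"text": text},
                },
            }
            for uri, line, text, importance in steps
        ],
    }


def make_result(rule_id, message, thread_flows):
    """Wrap one or more thread flows into a single result."""
    return {
        "ruleId": rule_id,
        "message": {"text": message},
        "codeFlows": [{"threadFlows": thread_flows}],
    }


def build_example():
    """Hypothetical two-thread path; every value here is made up."""
    writer = make_thread_flow("thread-1", [
        ("example.c", 10, "Entering writer()", "unimportant"),
        ("example.c", 12, "Pointer set to null", "essential"),
    ])
    reader = make_thread_flow("thread-2", [
        ("example.c", 30, "Pointer dereferenced", "essential"),
    ])
    return make_result("example.rule", "Possible null dereference",
                       [writer, reader])


if __name__ == "__main__":
    print(json.dumps(build_example(), indent=2))
```

Each step carries its own importance, and each thread gets its own
ordered list of steps, which are the two things I could not find a way
to express in plist.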

I certainly agree that it would have been possible to extend plist in 
many directions to compensate, but our feeling is that we would just end 
up with something that approached the expressiveness of SARIF, so it 
makes more sense to just do SARIF natively.

Also, Cppcheck outputs plist too, so extending it for Clang (if not done 
in a backwards-compatible way) would break any consumers of output from 
that tool.

I hope this helps,

-Paul

>
> Regards,
> Gabor
>
> On Mon, 17 Sep 2018 at 14:36, Artem Dergachev via cfe-dev 
> <cfe-dev at lists.llvm.org> wrote:
>
>     Hmm, this looks useful. I'd love to see how we perform compared to
>     other tools and have a look at interesting false negatives that such
>     a comparison would be able to find, though I understand that this
>     sort of comparison is hard because different tools may report the
>     same bug in different manners, on different lines of code, with
>     different warnings and notes, so even if they provide it in the same
>     format, matching them to each other automatically may be hard.
>
>     Analyzer outputs are implemented by PathDiagnosticConsumer
>     sub-classes, and it should be fairly straightforward to add a new
>     sub-class. You need to handle different "diagnostic pieces" (events
>     along the path, directions on how the path runs through the program,
>     etc.). Please let us know if you think that the class is not
>     receiving enough info to fill in everything you want to provide - we
>     could probably provide it.
>
>     As far as I understand, you want to eventually upstream your work. In
>     this case I encourage you to start as early as possible (i.e., even
>     if it's an empty implementation that emits empty files), by posting
>     early prototypes on our Phabricator and then adding incremental
>     patches on top of them, rather than waiting until your code is
>     finished. Essentially, LLVM development policy promotes run-time
>     flags as branches and discourages huge pull-requests from distant
>     forks because otherwise it's relatively easy to take a wrong turn.
>     We'll be able to advise you on what all these notes and events mean
>     or on other stuff of ours. There have been recent changes in how
>     consumers are handled, so please make sure you work with a recent
>     clang.
>
>
>     On 9/17/18 10:51 AM, Paul Anderson via cfe-dev wrote:
>     > All:
>     >
>     > This is my first post to this list, so first, let me give a quick
>     > introduction. I'm VP of Engineering at GrammaTech, where I am in
>     > charge of an advanced static analysis tool named CodeSonar. It
>     > primarily works for C and C++, but also for x86, x64 and ARM
>     > binaries. There is a little overlap with what CSA does, but
>     > CodeSonar's strength is in whole-program path-sensitive analysis
>     > for serious defects and security vulnerabilities.
>     >
>     > I'm writing to let the community know of some work we will be doing
>     > that should benefit everyone. I think I know the best way forward,
>     > but I'd appreciate any words of wisdom and feedback on our approach.
>     >
>     > This work is funded by a government research project aimed at
>     > modernizing open source static analysis tools. The project is named
>     > STAMP (the official funding agency page, which is admittedly very
>     > short on details, is here:
>     > https://www.dhs.gov/science-and-technology/csd-stamp).
>     >
>     > There are several thrusts, but the piece I have been working on is
>     > aimed at changing tools so that they can communicate more
>     > effectively with each other. Ultimately there will be a protocol to
>     > allow tools to exchange information actively, but the first part is
>     > simpler and fairly straightforward. We will be modifying tools so
>     > that they can output results in SARIF, a standard output format for
>     > static analysis tools:
>     > https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=sarif
>     > The standard was first conceived at Microsoft. I'm on the TC, along
>     > with representatives from other tool vendors and interested users.
>     >
>     > We've already written an adapter for CSA that can take plist-format
>     > output and convert it to SARIF, and we plan to make that available
>     > shortly. However, due to constraints on what is expressible with
>     > that format, we feel we can do a much better job if we change the
>     > analyzer to output SARIF natively, controlled by (say)
>     > -analyzer-output=sarif.
>     >
>     > We've done some prototyping of this on a fork and have it rolling
>     > over nicely. There's more to be done though before we are ready to
>     > submit anything for review. We've read all the material on
>     > contributing and will follow those guidelines as best we can.
>     > However, if anyone can think of a reason why we should do anything
>     > differently, or if there are particular pitfalls we should be aware
>     > of, I would greatly appreciate that input.
>     >
>     > Thanks in advance,
>     >
>     > -Paul
>     >
>     >
>
>

-- 
Paul Anderson, VP of Engineering, GrammaTech, Inc.
531 Esty St., Ithaca, NY 14850
Tel: +1 607 273-7340 x118; http://www.grammatech.com
