[cfe-dev] [analyzer] Project to output SARIF

Paul Anderson via cfe-dev cfe-dev at lists.llvm.org
Tue Sep 18 14:29:39 PDT 2018


Artem:

Thanks for your email, and nice to meet you.


On 9/17/2018 5:36 PM, Artem Dergachev wrote:
> Hmm, this looks useful. I'd love to see how we perform compared to
> other tools and have a look at interesting false negatives that such
> a comparison would be able to find, though I understand that this sort
> of comparison is hard because different tools may report the same
> bug in different manners, on different lines of code, with different
> warnings and notes, so even if they provide it in the same format,
> matching them to each other automatically may be hard.
It's relatively easy to compare tools in terms of their ability to 
detect superficial properties, but it gets very difficult for checkers 
that reason about the semantics of the code.
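
To make the matching problem concrete, here is a rough sketch of the kind of
superficial heuristic that is easy to write: pair up findings from two tools
when they name the same file and land within a few lines of each other. The
Finding type, the match_findings function and the line_tolerance parameter
are all hypothetical names made up for this illustration; they are not part
of CSA, CodeSonar, or SARIF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical, tool-neutral view of one reported warning.
    tool: str
    file: str
    line: int
    message: str


def match_findings(a, b, line_tolerance=3):
    """Greedily pair findings from two tools by file and nearby line.

    Returns (pairs, only_in_a, only_in_b). This compares locations only;
    it knows nothing about what the warnings actually mean.
    """
    unmatched_b = list(b)
    pairs = []
    only_a = []
    for fa in a:
        candidates = [
            fb for fb in unmatched_b
            if fb.file == fa.file and abs(fb.line - fa.line) <= line_tolerance
        ]
        if candidates:
            # Prefer the closest line among the remaining candidates.
            best = min(candidates, key=lambda fb: abs(fb.line - fa.line))
            unmatched_b.remove(best)
            pairs.append((fa, best))
        else:
            only_a.append(fa)
    return pairs, only_a, unmatched_b
```

A heuristic like this will happily pair two unrelated warnings that happen to
sit on neighbouring lines, and will miss the same bug reported at the
allocation site by one tool and at the use site by another, which is where
the comparison gets difficult.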
>
> Analyzer outputs are implemented by PathDiagnosticConsumer 
> sub-classes, and it should be fairly straightforward to add a new 
> sub-class. You need to handle different "diagnostic pieces" (events
> along the path, directions on how the path runs through the
> program, etc.). Please let us know if you think that the class is not
> receiving enough info to fill in everything you want to provide - we 
> could probably provide it.
That's exactly what we did. Most new code is in a new file named 
SarifDiagnostics.cpp.
>
> As far as I understand, you want to eventually upstream your work. In 
> this case I encourage you to start as early as possible (i.e., even if 
> it's an empty implementation that emits empty files), by posting early 
> prototypes on our Phabricator and then adding incremental patches on 
> top of it, rather than wait until your code is finished. Essentially, 
> LLVM development policy promotes run-time flags as branches and 
> discourages huge pull-requests from distant forks because otherwise 
> it's relatively easy to take a wrong turn. We'll be able to advise
> you on what all these notes and events mean, or on other parts of
> our code. There have been recent changes in how consumers are handled, so
> please make sure you work with a recent clang.
The immediate short-term goal will be to bring things up to date from 
about two months ago. Hopefully that won't take too long. I'll then 
start that review.

-Paul
>
>
> On 9/17/18 10:51 AM, Paul Anderson via cfe-dev wrote:
>> All:
>>
>> This is my first post to this list, so first, let me give a quick 
>> introduction. I'm VP of Engineering at GrammaTech, where I am in 
>> charge of an advanced static analysis tool named CodeSonar. It 
>> primarily works for C and C++, but also for x86, x64 and ARM 
>> binaries. There is a little overlap with what CSA does, but 
>> CodeSonar's strength is in whole-program path-sensitive analysis for 
>> serious defects and security vulnerabilities.
>>
>> I'm writing to let the community know of some work we will be doing 
>> that should benefit everyone. I think I know the best way forward, 
>> but I'd appreciate any words of wisdom and feedback on our approach.
>>
>> This work is funded by a government research project aimed at 
>> modernizing open source static analysis tools. The project is named 
>> STAMP (the official funding agency page, which is admittedly very 
>> short on details, is here: 
>> https://www.dhs.gov/science-and-technology/csd-stamp).
>>
>> There are several thrusts, but the piece I have been working on is 
>> aimed at changing tools so that they can communicate more effectively 
>> with each other. Ultimately there will be a protocol to allow tools 
>> to exchange information actively, but the first part is simpler and 
>> fairly straightforward. We will be modifying tools so that they can 
>> output results in SARIF, a standard output format for static analysis 
>> tools: 
>> https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=sarif. 
>> The standard was first conceived at Microsoft. I'm on the TC, along 
>> with representatives from other tool vendors and interested users.
>>
>> We've already written an adapter for CSA that can take plist-format 
>> output and convert it to SARIF, and we plan to make that available 
>> shortly. However, due to constraints on what is expressible in the plist
>> format, we feel we can do a much better job if we change the analyzer
>> to output SARIF natively, controlled by (say) -analyzer-output=sarif.
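
For readers who have not seen either format, the following is a minimal
sketch of what a plist-to-SARIF conversion might look like. It is not the
adapter mentioned above. It assumes the analyzer's plist report has a
top-level "files" array and a "diagnostics" array whose entries carry
"description", "check_name" and a "location" dictionary with "line", "col"
and a "file" index, and it assumes the result layout of the later SARIF
2.1.0 standard, which differs from the drafts that existed when this was
written. The function name plist_to_sarif and the default tool name are
hypothetical.

```python
import plistlib


def plist_to_sarif(plist_bytes, tool_name="clang-static-analyzer"):
    """Convert an analyzer plist report (as bytes) to a SARIF-like dict.

    Only the primary location of each diagnostic is converted; the
    per-diagnostic path events are ignored in this sketch.
    """
    report = plistlib.loads(plist_bytes)
    files = report.get("files", [])
    results = []
    for diag in report.get("diagnostics", []):
        loc = diag.get("location", {})
        file_index = loc.get("file", 0)
        # Assumed: "file" is an index into the top-level "files" array.
        uri = files[file_index] if 0 <= file_index < len(files) else ""
        results.append({
            "ruleId": diag.get("check_name", diag.get("type", "unknown")),
            "message": {"text": diag.get("description", "")},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": uri},
                    "region": {
                        "startLine": loc.get("line", 1),
                        "startColumn": loc.get("col", 1),
                    },
                }
            }],
        })
    return {
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": results}],
    }
```

The returned dictionary can be written out with the standard json module,
for example json.dumps(plist_to_sarif(data), indent=2).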
>>
>> We've done some prototyping of this on a fork and have it working
>> nicely. There's more to be done, though, before we are ready to
>> submit anything for review. We've read all the material on
>> contributing and will follow those guidelines as best we can. 
>> However, if anyone can think of a reason why we should do anything 
>> differently, or if there are particular pitfalls we should be aware 
>> of, I would greatly appreciate that input.
>>
>> Thanks in advance,
>>
>> -Paul
>>
>>
>

-- 
Paul Anderson, VP of Engineering, GrammaTech, Inc.
531 Esty St., Ithaca, NY 14850
Tel: +1 607 273-7340 x118; http://www.grammatech.com



