[cfe-dev] Proposal: Integrate static analysis test suites

<Alexander G. Riccio> via cfe-dev cfe-dev at lists.llvm.org
Wed Jan 27 23:50:10 PST 2016


>
> This is by design. Many more people have the compiler as part of their daily
> flow so it’s best to have such errors being reported by the compiler.
> Having the analyzer produce all of the compiler warnings is likely to be
> too nosy for the users.
>
(no*i*sy)

Ahh, makes sense. 'Twas a quirk of my workflow.

> Have you considered adding the tests to be tested on the additional
> analyzer build bot (
> http://lab.llvm.org:8080/green/job/StaticAnalyzerBenchmarks/) instead of
> adding them to lit? (I’ve suggested that earlier.)
>

I did, but I naively assumed that the buildbot ran some form of lit under
the hood. The README for the buildbot test suite looks helpful :)


Sincerely,
Alexander Riccio
--
"Change the world or go home."
about.me/ariccio

<http://about.me/ariccio>
If left to my own devices, I will build more.
⁂

On Thu, Jan 28, 2016 at 2:31 AM, Anna Zaks <ganna at apple.com> wrote:

>
> On Jan 27, 2016, at 11:16 PM, <Alexander G. Riccio> <test35965 at gmail.com>
> wrote:
>
>
>
>> This is not surprising, the static analyzer does not catch buffer
>> overflows. We do have an experimental checker for it but it is not very
>> strong.
>
>
> Personally, I think detecting stack overruns is a very valuable capability
> of a static analysis tool. Getting Clang to detect this issue with the
> default options should be a high priority.
>
>> One of the main issues is that the solver we use does not reason about
>> relational constraints that involve 2 symbols (ex: i < n).
>
>
> Stepping through the checker code over the past couple days, I can see
> this: it appears to "brute force" array accesses, evaluating loop
> conditions every single time. When checking the attached minimized case
> with only "-analyzer-checker=alpha.security.ArrayBoundV2" enabled, Clang
> seemed to bail out at around the third access. That makes sense, as the
> default value for analyzer-max-loop is 4. Indeed, if I bump
> "analyzer-max-loop" up to 11 (no pun intended), then Clang catches the issue.
> Console command line attached.
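
For the archive, the minimized case was of roughly this shape (I'm
paraphrasing from memory, so the names and sizes here are made up):

    /* needs more than 4 loop iterations before the bad write */
    void overrun(void) {
      char buf[10];
      int i;
      for (i = 0; i <= 10; i++) /* off-by-one: the last write is buf[10] */
        buf[i] = 0;
    }

and the invocation that finally caught it was something like:

    clang --analyze -Xclang -analyzer-checker=alpha.security.ArrayBoundV2 \
        -Xclang -analyzer-max-loop -Xclang 11 minimized.c
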
>
> As I think you alluded to, this sort of checking is best done
> "algebraically" instead of with brute force. I'm not really sure how to
> implement that sort of "algebraic" checker, but I've been curious about the
> problem for several years now. By discovering Clang's weaknesses in static
> analysis, and subsequently fixing them, I'll learn exactly that. I usually
> learn best when I learn "the hard way", and this seems like the perfect
> opportunity.
>
> Side note: does Clang/LLVM do any kind of loop bound access analysis during
> any of the optimization passes? Perhaps we can use that info to evaluate
> relational constraints that govern array access?
>
>> Would those be caught with compiler warnings? (Try running clang on them
>> with -Weverything.)
>>
>
> Actually, they do seem to be caught when compiler warnings are turned
> on... but they ONLY warn when NOT passing --analyze? Huh?
>
>
> This is by design. Many more people have the compiler as part of their daily
> flow so it’s best to have such errors being reported by the compiler.
> Having the analyzer produce all of the compiler warnings is likely to be
> too nosy for the users.
>
>
> I think diagnosing format string misuse during normal compilation is a
> fantastic idea - MSVC until 2015 required that you run /analyze, which very
> few people actually do
> <https://www.youtube.com/watch?v=4zgYG-_ha28&feature=player_detailpage#t=57m42s>*
> - but I'm not used to the idea that I can't run analysis at the same time...
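
Concretely, the kind of misuse I mean (my own example; the exact warning
wording may differ):

    #include <stdio.h>
    void log_count(int n) {
      /* -Wformat (on by default, IIRC) flags this: "%s" expects a
         char *, but the argument is an int */
      printf("%s\n", n);
    }

This warns during a plain compile, with no separate analysis step.
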
>
> In the short term:
>
> Now that I'm a bit familiar with the codebase, I expect to finish manually
> running the SAMATE tests in the next couple days, and start work on getting
> them to run under LIT after that.
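
My working assumption for the lit conversion, modeled on the existing
analyzer regression tests (the RUN line and the expected warning text
will probably need tweaking):

    // RUN: %clang_cc1 -analyze -analyzer-checker=core,alpha.security.ArrayBoundV2 -verify %s
    void stack_overrun(void) {
      int buf[4];
      buf[4] = 1; // expected-warning{{Out of bound}}
    }
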
>
>
> Have you considered adding the tests to be tested on the additional
> analyzer build bot (
> http://lab.llvm.org:8080/green/job/StaticAnalyzerBenchmarks/) instead of
> adding them to lit? (I’ve suggested that earlier.)
>
> *back then it was only for Xbox 360 devs; now it's in all editions of
> Visual Studio
>
> Sincerely,
> Alexander Riccio
> --
> "Change the world or go home."
> about.me/ariccio
>
> <http://about.me/ariccio>
> If left to my own devices, I will build more.
> On Mon, Jan 25, 2016 at 11:48 AM, Anna Zaks <ganna at apple.com> wrote:
>
>>
>> On Jan 24, 2016, at 12:58 AM, <Alexander G. Riccio> <test35965 at gmail.com>
>> wrote:
>>
>> Since that patch landed, I've manually run ~30 of the SAMATE/SARD tests,
>> and so far, Clang has missed 5 stack buffer overruns, 4 heap buffer
>> overruns,
>>
>>
>> This is not surprising, the static analyzer does not catch buffer
>> overflows. We do have an experimental checker for it but it is not very
>> strong. One of the main issues is that the solver we use does not reason
>> about relational constraints that involve 2 symbols (ex: i < n).
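
To illustrate the two-symbol constraint Anna describes (my own example,
as I understand the limitation):

    void fill(int *dst, int n) {
      int i;
      for (i = 0; i < n; i++) /* "i < n" relates two symbolic values */
        dst[i] = 0;
      /* proving dst[i] is in bounds means relating i, n, and the
         extent of dst, which the range-based solver can't do today */
    }
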
>>
>> and a couple of format string issues.
>>
>>
>> Would those be caught with compiler warnings? (Try running clang on them
>> with -Weverything.)
>>
>> Clang seems a bit better with double-free/use-after-free issues, and leak
>> issues.
>>
>> So it looks like there's some good stuff here, and we'll have a pretty
>> specific set of things to work on!
>>
>>
>> Thanks!
>> Anna.
>>
>>
>> Pretty cool, eh?
>>
>> Sincerely,
>> Alexander Riccio
>> --
>> "Change the world or go home."
>> about.me/ariccio
>>
>> <http://about.me/ariccio>
>> If left to my own devices, I will build more.
>> On Wed, Jan 20, 2016 at 1:11 AM, <Alexander G. Riccio>
>> <test35965 at gmail.com> wrote:
>>
>>> A quick update on this project:
>>>
>>> I've been slowed by a technical issue, and I lost ~2 weeks as two family
>>> members were in the hospital (sorry!).
>>>
>>> Since I develop on Windows, I quickly hit a testcase that clang didn't
>>> detect, as I discussed in "Clang on Windows fails to detect trivial
>>> double free in static analysis".
>>>
>>> That resulted in D16245 <http://reviews.llvm.org/D16245>, which (when
>>> accepted) fixes that issue. I want to ensure that a novice can simply pass
>>> "--analyze" and have clang "just work", so I've intentionally put off
>>> further testing work. Otherwise, I could hack around it, and subsequently
>>> forget about the workaround. Once that's dealt with, then I can resume work
>>> at a faster pace.
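
For reference, the Windows testcase in that thread reduced to something
like this (reconstructed from memory):

    #include <stdlib.h>
    int main(void) {
      char *p = malloc(16);
      free(p);
      free(p); /* unix.Malloc should flag this double free */
      return 0;
    }
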
>>>
>>>
>>>
>>> Sincerely,
>>> Alexander Riccio
>>> --
>>> "Change the world or go home."
>>> about.me/ariccio
>>>
>>> <http://about.me/ariccio>
>>> If left to my own devices, I will build more.
>>> On Mon, Jan 4, 2016 at 3:05 PM, Anna Zaks <ganna at apple.com> wrote:
>>>
>>>>
>>>> On Jan 2, 2016, at 12:45 PM, <Alexander G. Riccio> <test35965 at gmail.com>
>>>> wrote:
>>>>
>>>>> Devin has started writing scripts for running additional analyzer tests
>>>>> as described in this thread:
>>>>>
>>>>
>>>> A buildbot sounds like the perfect idea!
>>>>
>>>>> The idea was to check out the tests/projects from the existing repos
>>>>> instead of copying them. Would it be possible to do the same with these
>>>>> tests?
>>>>>
>>>>
>>>> Eh? What do you mean? Would that stop someone from running them in the
>>>> clang unit test infrastructure?
>>>>
>>>> I believe that these tests WILL need to be modified to run in the Clang
>>>> testing infrastructure.
>>>>
>>>>
>>>> Currently, the analyzer is only tested with the regression tests.
>>>> However, those need to be fast (since they affect all clang developers) and
>>>> they have limited coverage. Internally, we’ve been testing the analyzer
>>>> with the test scripts Devin described in the email I referenced. We use
>>>> that testing method to analyze whole projects and long running tests. Those
>>>> tests can and should be executed separately as they take more than an hour
>>>> to complete. The plan is to set up an external buildbot running those
>>>> tests.
>>>>
>>>> How long would it take to analyze the tests you are planning to add?
>>>> Depending on the answer to that question, adding your tests to the new
>>>> buildbot might be a better fit than adding them to the regression tests.
>>>>
>>>> I also prefer not to modify the externally written tests, since leaving
>>>> them untouched would allow us to update them more easily, for example,
>>>> when a new version of the tests comes out.
>>>>
>>>> Is there any way to treat static analyzer warnings as plain old
>>>> warnings/errors? Dumping them to a plist file from a command line
>>>> compilation is a bit annoying, and I think it's incompatible with the clang
>>>> unit testing infrastructure?
>>>>
>>>>
>>>> Plist output is one of the output formats that the clang static analyzer
>>>> supports. It is a much richer format than the textual warning since it
>>>> contains information about the path on which the error occurred. We did
>>>> have some lit tests checking plist output as well.
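
For anyone following along, the invocations I've been comparing are
roughly these (I may have the driver flags slightly off):

    # plist output, with the full path on which the error occurred:
    clang --analyze -Xclang -analyzer-output=plist -o report.plist test.c
    # plain text, closer to ordinary compiler warnings:
    clang --analyze -Xclang -analyzer-output=text test.c
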
>>>>
>>>>
>>>>
>>>> Sincerely,
>>>> Alexander Riccio
>>>> --
>>>> "Change the world or go home."
>>>> about.me/ariccio
>>>>
>>>> <http://about.me/ariccio>
>>>> If left to my own devices, I will build more.
>>>> On Mon, Dec 28, 2015 at 12:23 AM, Anna Zaks <ganna at apple.com> wrote:
>>>>
>>>>>
>>>>> On Dec 17, 2015, at 11:01 AM, <Alexander G. Riccio>
>>>>> <alexander at riccio.com> wrote:
>>>>>
>>>>>> However, if the goal is to have the tests
>>>>>> because you would like to make efforts to have the compiler diagnose
>>>>>> their cases properly, that's far more interesting and a good reason to
>>>>>> bring in the tests.
>>>>>
>>>>>
>>>>> That's exactly my intention. Improving the static analyzer to detect
>>>>> these cases is what will be interesting.
>>>>>
>>>>>> If the other tests are not clearly licensed, we
>>>>>> should try to get NIST to clarify the license of them before
>>>>>> inclusion.
>>>>>
>>>>>
>>>>> That sounds like the best idea; as a government agency, they almost
>>>>> certainly have lawyers.
>>>>>
>>>>> I think the next step is to integrate the working (error correctly
>>>>> diagnosed) tests, only those that are obviously in the public domain, and
>>>>> propose them as a big batched patch. This shouldn't itself be
>>>>> controversial.
>>>>>
>>>>> How exactly do I submit a patch? I see that the LLVM developer policy
>>>>> says to send it to the mailing list (cfe-commits), but I also see that Phabricator
>>>>> comes into this somewhere
>>>>> <http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20151214/145026.html>
>>>>> ?
>>>>>
>>>>>
>>>>> Devin has started writing scripts for running additional analyzer
>>>>> tests as described in this thread:
>>>>>
>>>>> http://clang-developers.42468.n3.nabble.com/analyzer-Adding-build-bot-for-static-analyzer-reference-results-td4047770.html
>>>>>
>>>>> The idea was to check out the tests/projects from the existing repos
>>>>> instead of copying them. Would it be possible to do the same with these
>>>>> tests?
>>>>>
>>>>> Sorry for not replying sooner!
>>>>> Anna.
>>>>>
>>>>> Sincerely,
>>>>> Alexander Riccio
>>>>> --
>>>>> "Change the world or go home."
>>>>> about.me/ariccio
>>>>>
>>>>> <http://about.me/ariccio>
>>>>> If left to my own devices, I will build more.
>>>>> On Thu, Dec 10, 2015 at 4:04 PM, Aaron Ballman
>>>>> <aaron at aaronballman.com> wrote:
>>>>>
>>>>>> On Mon, Dec 7, 2015 at 9:50 PM, <Alexander G. Riccio> via cfe-dev
>>>>>> <cfe-dev at lists.llvm.org> wrote:
>>>>>> > First-time Clang contributor here,
>>>>>> >
>>>>>> > I'd like to add the "C Test Suite for Source Code Analyzer v2", a
>>>>>> > relatively small test suite (102 cases/flaws), some of which Clang
>>>>>> > doesn't yet detect*. See link at bottom.
>>>>>> >
>>>>>> > Immediate questions:
>>>>>> > 0. Does the Clang community/project like the idea?
>>>>>>
>>>>>> I've included a few other devs (CCed) to get further opinions.
>>>>>>
>>>>>> I like the idea of being able to diagnose the issues covered by the
>>>>>> test suite, but I don't think including the test suite by itself is
>>>>>> particularly useful without that goal in mind. Also, one question I
>>>>>> would have has to do with the licensing of the tests themselves and
>>>>>> whether we would need to do anything special there.
>>>>>>
>>>>>> > 1. What's the procedure for including new tests? (not the technical,
>>>>>> > but the community/project).
>>>>>>
>>>>>> Getting the discussion going about the desired goal (as you are doing)
>>>>>> is the right first step.
>>>>>>
>>>>>> > 2. How do I include failing tests without breaking things? Some of
>>>>>> > these tests will fail - that's why I'm proposing their inclusion -
>>>>>> but
>>>>>> > they shouldn't yet cause the regression testing system to complain.
>>>>>>
>>>>>> Agreed, any test cases that are failing would have to fail gracefully.
>>>>>> I assume that by failure, you mean "should diagnose in some way, but
>>>>>> currently does not". I would probably split the tests into two types:
>>>>>> one set of tests that properly diagnose the issue (can be checked with
>>>>>> FileCheck or -verify, depending on the kind of tests we're talking
>>>>>> about), and one set of tests where we do not diagnose, but want to see
>>>>>> them someday (which can be tested with expected-no-diagnostics, for
>>>>>> example). This way, we can ensure test cases continue to diagnose when
>>>>>> we want them to, and we can be alerted when new diagnostics start to
>>>>>> catch previously uncaught tests. This is assuming that it makes sense
>>>>>> to include all of the tests at once, which may not make sense in
>>>>>> practice.
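
Concretely, I read Aaron's two buckets as two kinds of lit test, along
these lines (warning text approximate):

    // bucket 1: already diagnosed; lock the behavior in
    // RUN: %clang_cc1 -analyze -analyzer-checker=core -verify %s
    void caught(void) {
      int *p = 0;
      *p = 1; // expected-warning{{null pointer}}
    }

and, in a separate file (since -verify can't mix the two):

    // bucket 2: known miss; alerts us when a checker starts firing
    // RUN: %clang_cc1 -analyze -analyzer-checker=core -verify %s
    // expected-no-diagnostics
    void missed_case(void) {
      /* a flaw we don't diagnose yet */
    }
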
>>>>>>
>>>>>> > 3. How does Clang handle licensing of third party code? Some of
>>>>>> these
>>>>>> > tests are clearly in the public domain (developed at NIST, says "in
>>>>>> > the public domain"), but others are less clearly licensed.
>>>>>>
>>>>>> Oh look, you asked the same question I asked. ;-) If the tests are in
>>>>>> the public domain and clearly state as such, I think we can go ahead
>>>>>> and include them. If the other tests are not clearly licensed, we
>>>>>> should try to get NIST to clarify the license of them before
>>>>>> inclusion. Depending on the license, we may be able to include them
>>>>>> under their original license. If we cannot clarify the license, I
>>>>>> would guess that we simply should not include those tests as part of
>>>>>> our test suite. Note: I could be totally wrong, IANAL. :-)
>>>>>>
>>>>>> > Should the community accept that testsuite, and I successfully add
>>>>>> > that test suite, then I'd like to step it up a bit, and include the
>>>>>> > "Juliet Test Suite for C/C++". "Juliet" is a huge test suite by the
>>>>>> > NSA Center for Assured Software & NIST's Software Assurance Metrics
>>>>>> > And Tool Evaluation project, which has 25,477 test cases (!!) for
>>>>>> 118
>>>>>> > CWEs. I don't think any other open source compiler could compete
>>>>>> with
>>>>>> > Clang after this. There's a ton of literature on the "Juliet" suite,
>>>>>> > and listing it here is not necessary.
>>>>>> >
>>>>>> > This project would be my first Clang contribution :)
>>>>>> >
>>>>>> > Personally, I'm interested in static analysis, and this is the first
>>>>>> > step in understanding & improving Clang's static analysis
>>>>>> > capabilities.
>>>>>> >
>>>>>> > I have some ideas on how to detect the currently undetected bugs,
>>>>>> and
>>>>>> > I'm curious to see where things lead.
>>>>>>
>>>>>> Adding the tests by themselves is not necessarily interesting to the
>>>>>> project unless they exercise the compiler in ways it's not currently
>>>>>> being exercised. So just having tests for the sake of having the tests
>>>>>> is not too useful (IMO). However, if the goal is to have the tests
>>>>>> because you would like to make efforts to have the compiler diagnose
>>>>>> their cases properly, that's far more interesting and a good reason to
>>>>>> bring in the tests.
>>>>>>
>>>>>> One possible approach if you are interested in having the compiler
>>>>>> diagnose the cases is to bring the tests in one at a time. Start with
>>>>>> the initial batch of "these are diagnosed properly", then move on to
>>>>>> "this test is diagnosed properly because of this patch." Eventually
>>>>>> we'll get to the stage where all of the tests are diagnosed properly.
>>>>>>
>>>>>> > Secondary questions:
>>>>>> > 1. How should I break the new tests up into patches? Should I just
>>>>>> > whack the whole 102 case suite into a single patch, or a bunch of
>>>>>> > smaller ones?
>>>>>>
>>>>>> See comments above.
>>>>>>
>>>>>> > 2. How does the Clang/LLVM static analysis testing infrastructure
>>>>>> > work? I'm going to have to figure this out myself anyways, but where
>>>>>> > should I start? Any tips on adding new tests?
>>>>>>
>>>>>> http://clang-analyzer.llvm.org/checker_dev_manual.html
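
From a first skim of that manual, a checker skeleton comes out looking
roughly like this (just my sketch of the documented API, registration
boilerplate omitted, not working code):

    #include "clang/AST/Expr.h"
    #include "clang/StaticAnalyzer/Core/Checker.h"
    #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"

    using namespace clang;
    using namespace ento;

    namespace {
    // Visits every array subscript expression along each analyzed path.
    class TrivialBoundChecker
        : public Checker<check::PreStmt<ArraySubscriptExpr>> {
    public:
      void checkPreStmt(const ArraySubscriptExpr *E, CheckerContext &C) const {
        // Inspect the symbolic index, e.g.:
        //   SVal Idx = C.getState()->getSVal(E->getIdx(),
        //                                    C.getLocationContext());
        // then compare it against the array's extent and call
        // C.emitReport(...) when it is provably out of range.
      }
    };
    } // anonymous namespace
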
>>>>>>
>>>>>> Another good place for some of these checkers may be clang-tidy, or
>>>>>> the compiler frontend itself. It's likely to depend on case-by-case
>>>>>> code patterns.
>>>>>>
>>>>>> http://clang.llvm.org/extra/clang-tidy/
>>>>>>
>>>>>> Thank you for looking into this!
>>>>>>
>>>>>> ~Aaron
>>>>>>
>>>>>> >
>>>>>> > *If I remember correctly,
>>>>>> > https://samate.nist.gov/SRD/view_testcase.php?tID=149055 passes
>>>>>> > analysis without complaint. I manually spot checked a very small
>>>>>> > number of tests.
>>>>>> >
>>>>>> > "C Test Suite for Source Code Analyzer v2" (valid code):
>>>>>> > https://samate.nist.gov/SRD/view.php?tsID=101
>>>>>> > "C Test Suite for Source Code Analyzer v2" (invalid code):
>>>>>> > https://samate.nist.gov/SRD/view.php?tsID=100
>>>>>> >
>>>>>> > "Juliet Test Suite for C/C++" (files):
>>>>>> >
>>>>>> https://samate.nist.gov/SRD/testsuites/juliet/Juliet_Test_Suite_v1.2_for_C_Cpp.zip
>>>>>> > "Juliet Test Suite for C/C++" (docs):
>>>>>> >
>>>>>> https://samate.nist.gov/SRD/resources/Juliet_Test_Suite_v1.2_for_C_Cpp_-_User_Guide.pdf
>>>>>> >
>>>>>> >
>>>>>> > Sincerely,
>>>>>> > Alexander Riccio
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
> <minimized-stack_overflow-bad_commandline.txt>
> <minimized-stack_overflow-bad.c>
>
>
>