[cfe-dev] [analyzer] Regression testing for the static analyzer

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Thu Jun 11 10:50:32 PDT 2020


On 11.06.2020 8:13 PM, Kristóf Umann via cfe-dev wrote:
> +Ericsson gang
>
> Endre and Gábor Márton in particular worked a lot on buildbots (the CTU 
> related ones especially), so I wouldn't risk summarizing our 
> current stance/progress on this issue.
>
> What I will say, however, from my perspective, is that I find 
> committing stressful for all the reasons you mentioned. While I do my 
> best to contribute non-breaking code, the tedious process of jumping 
> on the company VPN and finding an appropriate server that isn't under 
> heavy load to run an analysis that is thorough enough sometimes leads 
> me to commit seemingly miscellaneous patches after only running 
> check-clang-analysis, which occasionally comes back to bite me. Things 
> like changes in the report count (in drastic cases, changes in the bug 
> reports themselves, such as new notes), side effects on other 
> platforms, etc. make this process really error-prone as well, not to 
> mention that it's at the point where I'm just itching to commit and 
> move on. While the responsibility for the committed or 
> soon-to-be-committed code still falls on the contributor, the lack of 
> buildbots on a variety of platforms still makes this process very 
> inconvenient and downright hostile to non-regulars. Not to mention the 
> case where I fill the role of the reviewer.
>
> All in all, I really appreciate this project and agree strongly 
> with your goals!
>
> On Thu, 11 Jun 2020 at 17:51, Gábor Horváth via cfe-dev 
> <cfe-dev at lists.llvm.org> wrote:
>
>     Hi!
>
>     I'm glad that someone picked this up. Making it easier to test the
>     analyzer on real-world projects is an important task that can
>     ultimately make it much easier to contribute to the analyzer.
>     See some of my comments inline.
>
>     On Thu, 11 Jun 2020 at 16:23, Valeriy Savchenko via cfe-dev
>     <cfe-dev at lists.llvm.org> wrote:
>
>
>         A person has to find at least a couple of projects, build them
>         natively, and check them with the analyzer. ... It should be
>         dead simple, maybe as simple as running `lit` tests.
>
>
>     While I think this is a great idea, we also should not forget that
>     the tested projects should exercise the right parts of the
>     analyzer. For instance, a patch adding exception support should be
>     tested on projects that use exceptions extensively. Having a
>     static set of projects will not solve this problem. Nevertheless,
>     this is something that is far less important to solve. First, we
>     need something that is very close to what you proposed.
>
>
>         Another point of interest is reproducibility.
>
>
>     Huge +1. Actually, I'd even be glad to see more extreme measures,
>     like running the analyzer multiple times and making sure that the
>     number of exploded graphs and other statistics are stable, to
>     avoid introducing non-deterministic behavior.
>

This one's not just about nondeterminism, it's also about 
reproducibility across machines with different systems and system 
headers. You'll be able to say "hey, we broke something in our docker 
tests, take a look", and you'll no longer need to extract a preprocessed 
file and send it to me. That helps a lot if we try to collectively keep 
an eye on the effects of our changes on a single benchmark (or even if 
you have your own benchmark, it's easy to share the project config, 
because it's basically just a link to the project on GitHub).
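For example (purely illustrative; this is not the actual config format, 
and the project name and tag below are just placeholders), a project 
entry could be little more than a repository URL plus a pinned revision, 
and the harness would check out exactly the same sources inside the same 
docker image on every machine:

# Hypothetical sketch: a "project config" that is basically just a link
# to the project on GitHub plus a pinned revision, so every machine
# analyzes exactly the same sources.
import subprocess

PROJECTS = [
    {
        "name": "tinyxml2",                                   # placeholder entry
        "url": "https://github.com/leethomason/tinyxml2.git",
        "revision": "9.0.0",                                  # pinned tag/commit
    },
]

def checkout(project, workdir):
    """Clone the pinned revision of a project into workdir/<name>."""
    dest = "%s/%s" % (workdir, project["name"])
    subprocess.check_call(["git", "clone", project["url"], dest])
    subprocess.check_call(["git", "-C", dest, "checkout", project["revision"]])
    return dest

if __name__ == "__main__":
    for p in PROJECTS:
        checkout(p, "/tmp/sa-projects")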


>         Short summary of what is there:
>           * Info on 15 open-source projects to analyze, most of which
>         are pretty small
>           * Dockerfile with fixed versions of dependencies for these
>         projects 
>
>
>     Dependencies are the bane of C++ at the moment. I'd love to see
>     some other solutions for this problem. Some that come to mind:
>     * Piggybacking on the source repositories of Linux distributions.
>     We could easily install all the build dependencies using the
>     package manager automatically. The user would only need to specify
>     the name of the source package; the rest could be automated
>     without having to manually search for the names of the dependent
>     packages.
>     * Supporting C++ package managers. There are Conan, vcpkg, and
>     some CMake-based ones. We could use a base docker image that
>     already has these installed.
>

Just curious: given that it's Debian under the hood, can we replace our 
make scripts with "scan-build apt-build" or something like that?
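I'm not sure about "apt-build" specifically, but the distro-piggybacking 
idea above could look roughly like this inside the image. This is only a 
sketch: it assumes a Debian-based image with deb-src entries enabled, 
and whether scan-build's compiler interposition survives any given 
package's build system is an open question.

# Sketch of the "piggyback on distro source packages" idea: given only a
# source package name, apt resolves and installs the build dependencies,
# fetches the sources, and we run the package's own build under scan-build.
import glob
import subprocess

def analyze_source_package(package, output_dir):
    # Install everything the package needs to build.
    subprocess.check_call(["apt-get", "build-dep", "-y", package])
    # Fetch and unpack the upstream sources into the current directory.
    subprocess.check_call(["apt-get", "source", package])
    # apt-get source unpacks into "<package>-<version>/"; pick it up.
    srcdir = glob.glob("%s-*/" % package)[0]
    # Build the package under scan-build; -o collects the HTML reports.
    # (Whether the CC/CXX interposition actually takes effect varies
    # from package to package.)
    subprocess.check_call(
        ["scan-build", "-o", output_dir, "dpkg-buildpackage", "-b", "-us", "-uc"],
        cwd=srcdir)

if __name__ == "__main__":
    import sys
    analyze_source_package(sys.argv[1], "/tmp/scan-results")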


>         The system has two dependencies: python (2 or 3) and docker.
>
>
>     How long do we want to retain Python 2 compatibility? I'm all in
>     favor of not supporting it for long (or at all).
>

As far as I understand, we still haven't "officially" transitioned to 
Python 3 in LLVM. I don't think it actually matters for these scripts; 
it's not like they're run every day on an ancient buildbot that still 
doesn't have Python 3 (in fact, as of now I don't think anybody uses 
them at all except us). In any case, it sounds like the only script that 
really needs to stay Python 2 compatible, to satisfy every possible 
formal requirement, is `SATest.py` itself, which is a trivial wrapper 
that parses some arguments and forwards them into docker; for everything 
else there's docker, and you don't care what's inside it.
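To illustrate what I mean by "trivial wrapper" (this is not the actual 
SATest.py; the image and script names below are made up), something like 
the following stays happily bilingual, since argparse and subprocess 
behave the same on Python 2.7 and 3, and everything heavyweight lives 
inside the image:

#!/usr/bin/env python
# Minimal sketch of a wrapper that parses a few arguments and forwards
# the real work into docker; works unchanged on Python 2.7 and 3.
import argparse
import subprocess

def main():
    parser = argparse.ArgumentParser(description="Run analyzer regression tests")
    parser.add_argument("command", choices=["build", "compare", "benchmark"])
    parser.add_argument("--projects", default="all",
                        help="comma-separated project names")
    args = parser.parse_args()

    # Everything nontrivial runs inside the container, so the host only
    # needs docker and whatever python happens to be around.
    docker_cmd = [
        "docker", "run", "--rm", "satest-image",      # hypothetical image name
        "python3", "/scripts/entrypoint.py",          # hypothetical entry point
        args.command, "--projects", args.projects,
    ]
    return subprocess.call(docker_cmd)

if __name__ == "__main__":
    raise SystemExit(main())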


>
>         (I am not a `csa-testbench` user, so please correct me if I'm
>         wrong here)
>
>
>     Your assessment is 100% correct here. We always wanted to add
>     docker support and support for rebuilding source deb packages to
>     solve most of the issues you mentioned.
>
>
>           * I want it to cover all basic needs of the developer:
>               - analyze a bunch of projects and show results
>               - compare two given revisions
>               - benchmark and compare performance
>
>
>     I think one very important feature is to collect/compare not only
>     the analysis results but also more fine-grained information, like
>     the statistics emitted by the analyzer (number of refuted reports
>     in case of refutation, number of exploded nodes, and so on).
>     It would be nice to be able to retrieve anything crash-related
>     like call stacks and have an easy way to ssh into the docker image
>     to debug the crash within the image.
>     Also, the csa-testbench has a feature to define regular
>     expressions and collect the matching lines of the analyzer output.
>     This can be useful to count/collect log messages.
>
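
The regexp feature sounds very handy, and it should be easy to carry 
over. A rough sketch of what I have in mind (the pattern names and 
expressions below are made up, not anything csa-testbench actually 
ships):

# Sketch: collect/count lines of analyzer output matching user-defined
# regular expressions, e.g. to pull out statistics or log messages.
import re

PATTERNS = {
    "exploded-nodes": re.compile(r"(\d+)\s+.*exploded nodes"),  # example pattern
    "warnings": re.compile(r"warning:"),                        # example pattern
}

def collect(output_lines, patterns=PATTERNS):
    """Return {pattern name: list of matching lines} for later comparison."""
    hits = {name: [] for name in patterns}
    for line in output_lines:
        for name, rx in patterns.items():
            if rx.search(line):
                hits[name].append(line.rstrip("\n"))
    return hits

# Usage: hits = collect(open("analyzer.log")); len(hits["warnings"]), etc.
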
>
>           * I want all commands to be as simple as possible, e.g.:
>
>
>     While I see the value of having a minimal interface, I wonder if
>     it will be a bit limiting to the power users in the end (see
>     extracting statistics and logs based on regexps).
>

I think it's totally worth it to have both. When a newcomer tries to 
test their first checker, there's nothing better than a simple one-liner 
that we can tell them to run. But on the other hand, having fine-grained 
commands for controlling every step of the process is absolutely 
empowering and not going anywhere.
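Concretely, I'd picture the newcomer one-liner as a thin default layered 
on top of the fine-grained steps, so both audiences go through the same 
code paths. A rough sketch (all subcommand and option names are 
hypothetical):

# Sketch: a newcomer-friendly one-liner built on top of fine-grained steps.
import argparse

def build_parser():
    parser = argparse.ArgumentParser(prog="SATest.py")
    sub = parser.add_subparsers(dest="command")

    # The simple path: "check" runs checkout + analyze + report with defaults.
    sub.add_parser("check", help="run the whole pipeline with defaults")

    # The power-user path: each step is its own command with its own knobs.
    analyze = sub.add_parser("analyze", help="run the analysis step only")
    analyze.add_argument("--jobs", type=int, default=1)
    analyze.add_argument("--collect-regex", action="append", default=[],
                         help="extra regexes to collect from analyzer output")
    sub.add_parser("compare", help="diff results between two runs")
    return parser

if __name__ == "__main__":
    # "SATest.py check" for newcomers; "SATest.py analyze --jobs 8" for the rest.
    print(build_parser().parse_args())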


>           * Would you use a system like this?
>
>
>     In case it supports my needs, definitely. As you mentioned, there
>     are multiple contenders here: csa-testbench and SATest. I do see
>     why the testbench is not desirable (mainly because of the
>     dependencies), but I wonder if it would make sense to have
>     compatible configurations, i.e. one could copy and paste a project
>     from one to the other and have it working without any additional
>     effort.
>
>
>           * Does the proposed solution seem reasonable in this situation?
>
>
>     Looks good to me.
>
>
>           * What do you think about the directions?
>
>
>     +1
>
>
>           * What other features do you want to see in the system?
>
>
>     See my other inlines above.
>
>
>           * What are the priorities for the project and what is the
>         minimal feature
>             scope to start using it?
>
>
>     If we can run it reliably on big projects, I'd say have a buildbot
>     as soon as possible (one that only triggers when crashes are
>     introduced). I think it could have prevented many errors.
>
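
A cheap way to get that crash-only signal: on POSIX, a negative return 
code from the analyzer process means it died from a signal, so the bot 
can ignore report-count churn and only complain about real breakage. A 
sketch (Python 3 here, since it would only run inside the image):

# Sketch: flag only the runs where the analyzer process itself crashed
# (negative return code == killed by a signal on POSIX).
import subprocess

def run_and_check_for_crash(cmd):
    """Run one analysis command; return True if the process crashed."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode < 0:
        print("CRASH (signal %d): %s" % (-proc.returncode, " ".join(cmd)))
        print(proc.stderr[-2000:])  # tail of stderr, where the call stack ends up
        return True
    return False

# e.g. run_and_check_for_crash(["clang", "--analyze", "foo.c"])
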
>
>         Thank you for taking your time and reading through this!
>