[cfe-dev] codechecker into clang/LLVM?

Mon Dec 14 11:44:53 PST 2015

> On Dec 14, 2015, at 1:43 AM, Dániel Krupp <daniel.krupp at ericsson.com> wrote:
> 
> Hi,
> Answers in blue.
> 
> Regards,
> Daniel
>  
> From: Anna Zaks [mailto:ganna at apple.com <mailto:ganna at apple.com>] 
> Sent: 2015. december 11. 23:52
> To: Dániel Krupp
> Cc: cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
> Subject: Re: [cfe-dev] codechecker into clang/LLVM?
>  
>  
> On Dec 11, 2015, at 1:51 AM, Dániel Krupp <daniel.krupp at ericsson.com <mailto:daniel.krupp at ericsson.com>> wrote:
>  
> Hi Anna,
>  
> Please find my comments below in red.
> We will start integrating the automated tests into lit soon.
> 
> Regards,
> Daniel
>  
>  
>  
> From: Anna Zaks [mailto:ganna at apple.com <mailto:ganna at apple.com>] 
> Sent: 2015. december 8. 20:32
> To: Dániel Krupp
> Cc: cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
> Subject: Re: [cfe-dev] codechecker into clang/LLVM?
>  
> Hi Daniel,
> 
> I’ve looked at the project in a bit more detail and here are some other general comments.
> 
> It is very desirable to allow CodeChecker to work with scan-build (the current one and/or the python rewrite). This would ensure that all projects that we can analyze now (on all platforms such as OS X and Windows) will be supported. We agree that certain parts of current scan-build could be improved; however, getting a modular design will allow us to stage that process. Keeping CodeChecker's build interposition separate from the current scan-build would cause fragmentation between the users and developers. Is it possible to integrate CodeChecker with scan-build?
> DK> Yes it is. I think the best would be to add an option to CodeChecker to be able to parse the new scan-build output directory.
>  
> Just to clarify, it would only work with the "new scan-build" since CodeChecker takes the compilation database as input, correct?
> DK> CodeChecker can take the compilation db as the input and invoke clang even today (CodeChecker check -l compilation_db …), so in this sense it works well together with scan-build.
> However, we could implement an option to take the output directory of the new scan-build, with its plist files, as the input. This would work better for projects that remove source files during the build. 

So this would work with the existing scan-build, correct? I would really like to have this option supported; this would also allow us to test the part of the tool that does not deal with interposition even now on OS X.

>  
> Later, we can replace codechecker’s build logger with the new scan-build after we made sure that scan-build supports everything we need (for example executing clang-tidy).
> But in the short term, parsing scan-build’s output should be good enough.
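
As a rough illustration of the idea of reading scan-build's plist output directory, here is a minimal Python sketch. It is not CodeChecker's implementation. The function name `load_reports` is hypothetical, and the sketch assumes the plist layout that the analyzer emits: a top-level `files` list and a `diagnostics` list whose `location` entries refer to a file by index.

```python
import plistlib
from pathlib import Path


def load_reports(output_dir):
    """Collect (file, line, description) tuples from all .plist files under output_dir."""
    reports = []
    for plist_path in sorted(Path(output_dir).rglob("*.plist")):
        with open(plist_path, "rb") as handle:
            data = plistlib.load(handle)
        files = data.get("files", [])
        for diag in data.get("diagnostics", []):
            location = diag.get("location", {})
            file_index = location.get("file")
            # Assumption: "file" is an index into the top-level "files" list.
            if isinstance(file_index, int) and 0 <= file_index < len(files):
                source = files[file_index]
            else:
                source = "<unknown>"
            reports.append((source, location.get("line"), diag.get("description", "")))
    return reports
```

Because the reports already carry the file names and line numbers, a reader like this would not need to re-run the build, which is the property that makes it usable on top of an existing scan-build run.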
>  
>  
> Is it possible to have something like "quick check" mode but with the enhanced HTML viewer? The database is needed for bug tracking, but requiring users to set it up to view bugs does not seem necessary. The easier it is to run/setup the tool for basic usage the more people will use it!
> DK>  It is possible already using sqlite db.
>  
> Could you explain why the database is needed at all? Why the overall design requires it?
> DK> For analyzing a project with a few files you would not need a db. However, we designed it for the use case where you analyze a project with a million lines of code and potentially thousands of analyzer reports. Imagine that you analyze and store many versions of this project, each with, say, 5000 reports (with bug paths). Then let’s say you would like to see the new bugs and resolved bugs between 2 such versions (which has now become possible with the bug hashes), or you want to filter/order the results based on fault priority/file path. For these tasks you need a DB backend with indexing. Does that make sense?
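
To make the comparison between two stored versions concrete: at its core it is a set difference over bug hashes. The sketch below is only an in-memory illustration with made-up hashes and descriptions; the function name `compare_runs` is hypothetical, and a database backend would express the same thing as indexed queries over far more reports.

```python
def compare_runs(baseline, current):
    """Split reports into new, resolved and unchanged by comparing bug hashes.

    Both arguments map a bug hash (string) to a report description.
    """
    base_hashes = set(baseline)
    curr_hashes = set(current)
    return {
        "new": sorted(curr_hashes - base_hashes),
        "resolved": sorted(base_hashes - curr_hashes),
        "unchanged": sorted(base_hashes & curr_hashes),
    }


# Made-up example data: two analysis runs of the same project.
run_1 = {"a1f3": "Null dereference in parse()", "b7c9": "Leak in open_db()"}
run_2 = {"b7c9": "Leak in open_db()", "d204": "Dead store in main()"}
result = compare_runs(run_1, run_2)
print("new:", result["new"])
print("resolved:", result["resolved"])
```

The result is only as reliable as the hash: if the hash of an unchanged bug differs between two runs, that bug shows up as both resolved and new.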
> 

Yes, I understand why the database is needed in some workflows. My question is about whether it is possible to use the tool without a database. For example, a user may want to scan the project and see the results as a way of learning more about the static analyzer. Or a user might keep their project clean of the static analyzer issues and scan it from time to time. Basically, any workflow that scan-build supports right now would benefit from having a better report viewer, which code checker provides.

> 
> Setting up postgresql for a quick analysis is cumbersome,
> therefore we added sqlite support (based on earlier suggestion from Laszlo Nagy).
>  
> You can run the check and the server like this
> CodeChecker check -n test --sqlite -w /tmp/workspace -b "make"
> CodeChecker server --sqlite -w /tmp/workspace -v 6060 --not-host-only
>  
> So you can run this quickly without postgres config, suitable for basic usage.
> 
> We’d need to integrate the automated tests into lit and add documentation.
> DK> Sure, we will integrate auto-tests into lit. Documentation (user-guide) is already available in markdown: https://github.com/Ericsson/codechecker/blob/master/docs/user_guide.md
> 
> Suppression in code should be handled by the compiler/analyzer, hopefully, using a more familiar syntax. (We can discuss this separately later.)
> DK> Would be nice to use the same suppression syntax also in clang-tidy.
>  
> I think we should discuss the design on a separate thread. I do not think it's a critical feature, so we could do this later. The main issue here is that I am not comfortable with recommending users a way for suppressing issues in code until the greater community agrees upon the design for issue suppression and compiler support is implemented. Once these recommendations are made, the users assume that the format will be supported going forward.
> DK> Unfortunately clang-sa reports many false positives for our projects. We spend a considerable amount of time analyzing these reports, selecting and marking false positives. Since the bug-hash format is still not stable, there is a risk that this invested work gets lost. So a stable in-code bug suppression would be a very nice feature both for tidy and clang-sa, preferably according to a greater community agreement as you suggest.
>  
> 
> All command line options should be compatible/make sense in the potential future world of whole project analysis. (This might already be the case.)
>  
> Thank you,
> Anna.
> 
> On Nov 12, 2015, at 3:44 PM, Anna Zaks via cfe-dev <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>  
> 
> On Nov 11, 2015, at 9:07 AM, Dániel Krupp <daniel.krupp at ericsson.com <mailto:daniel.krupp at ericsson.com>> wrote:
>  
> Hi Anna,
>  
> First, thanks for looking into this.
> We are open to any suggestions that you feel necessary to get this accepted to LLVM/Clang as a bug-tracking solution…
>  
> >- What would it take for this to replace scan-build?...
> We’ve been mainly targeting (I mean testing it on) Linux, but Mac and Windows support can be easily added too.
> Since the whole thing is in python, the only issue here could be the “build interposition”, as you pointed out. Other than that, it’s pretty much platform independent.
>  
> Regarding the “build interposition”: CodeChecker uses the standard clang JSON compilation database <http://clang.llvm.org/docs/JSONCompilationDatabase.html> format as an input which you can pass like this (CodeChecker check -l <build_log.json>).
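
For readers unfamiliar with the format: a JSON compilation database is a JSON array of entries, each with a `directory`, a `file`, and either a `command` string or an `arguments` list. The following Python sketch shows one way a tool could read such a file. It is an illustration of the format only, not CodeChecker's code, and the function name `compile_commands` is hypothetical.

```python
import json
import shlex


def compile_commands(path):
    """Yield (directory, file, argv) for each entry in a JSON compilation database."""
    with open(path, encoding="utf-8") as handle:
        entries = json.load(handle)
    for entry in entries:
        # An entry carries either an "arguments" list or a "command" string.
        if "arguments" in entry:
            argv = list(entry["arguments"])
        else:
            argv = shlex.split(entry["command"])
        yield entry["directory"], entry["file"], argv
```

Each yielded triple holds what is needed to re-run the compiler, or an analyzer with the same flags, on a single translation unit.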
>  
> Do we have to use the JSON compilation database or can this be made to work with the existing scan-build?
> 
> 
> 
> You can also generate this log with other tools, for example with Bear <https://github.com/rizsotto/Bear> on Mac.
>  
> CC-ing Laszlo who is working on a scan-build rewrite in Python. This is the rewrite I've mentioned in one of the previous emails. It is flexible so that it could use either build interposition (bear) or ccc-analyzer style build.
>  
> Ideally, we would have a component that would have the parity in capabilities with scan-build (or better). Laszlo has made a lot of progress on this. We could use that component for interposition and have a bug management system on top of it.
> 
> 
> 
> Currently the built-in logger (based on LD_PRELOAD) supports compilation logging on Linux smoothly. We could also add a logger based on a bash script, which would be platform independent.
>  
> How would the bash script logger work?
> 
> 
> 
>  
> We can do some testing on windows, but any help testing on Mac is welcome (as we don’t use Macs).
>  
> > Is licensing compatible?
> It should be. Except for psycopg <http://initd.org/psycopg/license/> (the postgres database connector) we are not relying on any GPL or LGPL stuff.
>  
> If psycopg is a problem, it could be replaced with another postgres connector, such as pg8000 <https://pypi.python.org/pypi/pg8000>, which has a BSD license.
>  
>  
> It is a problem. (I cannot test this until it's free of LGPL.) Good to hear that we can switch to using an alternate method.
>  
> I collected here the licenses of the dependencies. All dependencies are run-time dependencies (except for the thrift compiler) and are used without modification.
>  
> Javascript dependencies
> *codemirror (https://codemirror.net/ MIT licence)
> *jsplumb (community edition, MIT https://jsplumbtoolkit.com/license#community)
> *marked (BSD like https://github.com/chjj/marked/blob/master/LICENSE)
> *dojotoolkit (new BSD license https://dojotoolkit.org/license.html)
>  
> Python dependencies
> ·         Python2 <https://www.python.org/> (> 2.7) (Python Software Foundation license https://www.python.org/download/releases/2.7/license/)
> ·         Alembic <https://pypi.python.org/pypi/alembic> (>=0.8.2) (MIT)
> ·         SQLAlchemy <http://www.sqlalchemy.org/> (> 1.0.2) (MIT)
> ·         psycopg2 <http://initd.org/psycopg/> (> 2.5.4) (LGPL http://initd.org/psycopg/license/)
> Other dependencies
> ·         Clang Static Analyzer <http://clang-analyzer.llvm.org/> 
> ·         Postgresql <http://www.postgresql.org/> (> 9.3.5) (BSD like http://www.postgresql.org/about/licence/)
> ·         Thrift (compilation dependency) (Apache v2.0 https://thrift.apache.org/)
> ·         Bzip2 is used for test project only (can be removed) (BSD like http://www.bzip.org/)
>  
>  
> Could you suggest which dependencies are problematic? We will investigate how to replace those.
>  
> I do not know if there are other licensing issues. They all look OK to me but I am not an expert.
>  
> Ideally, we'd also want to smooth out the installation process as much as possible.
>  
> Could you help in testing and/or making it Mac compatible? I suggest first running it on a standard JSON build db using the -l option.
>  
> Regards,
> Daniel
>  
>  
> From: ganna at apple.com <mailto:ganna at apple.com> [mailto:ganna at apple.com <mailto:ganna at apple.com>] 
> Sent: 2015. november 10. 19:20
> To: Dániel Krupp
> Cc: cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
> Subject: Re: [cfe-dev] codechecker into clang/LLVM?
>  
> Hi Daniel,
>  
> Sorry for taking so long to reply!
>  
> The clang static analyzer is definitely missing a bug tracking system and I believe this project has good potential to fill that need. Here are a couple of concerns that immediately come to mind:
>  
> - What would it take for this to replace scan-build? Can scan-build be used instead of the interposition module you use? For example, can we control the build interposition method by some option and the bug tracking would be an add-on on top of that? I suspect that your solution does not work on all platforms that scan-build currently supports (Mac and Windows come to mind). That is the main concern here. There are also projects that might not build with the type of interposition you use. I am not sure if you are aware of the scan-build rewrite (in Python) effort, where all these issues were raised as well.
>  
> - Is licensing compatible? The llvm codebase tries to stay clear of any dependencies on GPL or LGPL licenses because there are companies who are involved with the project and cannot use software tainted with those licenses. 
>  
> - The list of dependencies is large, which is a concern if this was to replace scan-build.
>  
> Anna.
>  
> On Oct 22, 2015, at 7:14 AM, Dániel Krupp via cfe-dev <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>  
> Hello All,
>  
> Scan-build, the current bug viewer and front-end tool for the Clang Static Analyzer, has some scalability issues and limitations.
> For example, scan-build creates static HTML reports, storing whole source files as many times as they are included in a report.
> Incremental bug reporting (showing only new bugs compared to a baseline) and false positive suppression are not supported either.
>  
> To address these issues, back in July we published CodeChecker on GitHub ( https://github.com/Ericsson/codechecker ),
> a new defect storage and management infrastructure for Clang Static Analyzer (written in python). We also gave a talk about this at EuroLLVM 2015 (http://llvm.org/devmtg/2015-04/).
>  
> The most important features are the following:
>      - scalable dynamic web based defect viewer (instead of static html)
>      - a new command line tool for analyzing projects which is usable in CI scripts
>      - a PostgreSQL based defect storage & management
>      - incremental bug reporting (show only new bugs compared to a baseline)
>      - suppression of false positives
>      - better integration with build systems (through the LD_PRELOAD mechanism)
>      - Apache Thrift API based server-client model for storing bugs and viewing results.
>      - It is possible to connect multiple bug viewers. Currently a web-based viewer and a command line viewer are provided.
>  
> Since its publication we have fixed many errors and addressed user feedback, and now I think it is mature enough.
>  
>  
> We could release the tool under LLVM license.
>  
> If you agree, this tool could be part of the llvm/clang source tree, possibly besides scan-build (or a separate llvm repository?).
> I am not sure about the official process.
> Can anyone help with this?
>  
> Regards,
> Daniel
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
> 
>  
>  
