[cfe-dev] scan-build in python

Mon Aug 25 11:24:23 PDT 2014

Hi, Laszlo. Sorry for going silent for, um, months; Swift has been taking a lot of our time. But we realized that listing the project on the "Open Projects" list without any real context was probably not a great idea. I'd like to take a step back and talk about where we see this going.

scan-build has been around pretty much as long as the analyzer has; it was (and is) a cheap way to piggy-back on an existing build system to get the analyzer to run on a project without much work. It already does that, and it's good at that, but the current implementation has some problems.

- It's not necessarily so clean. Ted admits that the current implementation may not be the cleanest code; Perl-isms aside, it has grown in one direction and then another over the years to implement various enhancements. Both scan-build and ccc-analyze could use cleanups.

- It's not tested. We don't have a single public test that runs scan-build or even ccc-analyze. Apple has some tests internally, but we haven't done anything with them to make them accessible to open-source contributors.

- It's written in Perl. LLVM has a lot more Python in it than Perl, including the Python bindings and even the scan-view tool we ship with scan-build. Being Perl is currently a bit of a barrier to entry to working on scan-build. (The other obvious choice, C++ "like the rest of LLVM", has the disadvantage of requiring compilation, which doesn't play well with extensibility.)

We'd like a hypothetical scan-build replacement to fix these issues, but also to give us a good base to build on for the future:

- Reusable / Extensible. You're using Beye to handle analyzing files based on a compilation database rather than an existing build system. Wouldn't it have been nice to have been able to reuse parts of scan-build instead?

- Maintainable. As you've seen, I haven't been so sure of what everything in the current scan-build / ccc-analyze is for. Ted could probably still tell you, but he's inherently busy due to being a manager. It's not really a good thing if only one person knows how something works! That's true in too many parts of Clang already; we should endeavour to make that less true whenever possible.

- Easy to Distribute. The current Perl code does have one advantage: pretty much all Unix systems have a Perl as part of their base installation. Several years ago the same wasn't true of Python, but I think that's changed. Even so, we should make sure it's still easy to ship an analyzer build, scan-build included, on the platforms we care about. (This also includes minimizing dependencies for both developers and users of the tool, so thanks for already keeping that in mind.)
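
To make the "Reusable / Extensible" point concrete, here is a minimal sketch of the kind of small, standalone piece that both a build-interposing front end and a compilation-database front end could share. It assumes the standard JSON compilation database layout (a list of entries with "directory", "file", and either "arguments" or "command"). The function name iter_compile_commands is made up for illustration and is not part of scan-build or any existing tool.

```python
import json
import shlex


def iter_compile_commands(database_text):
    """Yield (directory, file, argv) for each entry in a JSON compilation database.

    Hypothetical helper for illustration only; takes the database contents
    as a string so it can be used without touching the file system.
    """
    for entry in json.loads(database_text):
        if "arguments" in entry:
            # The entry already carries a pre-split argument list.
            argv = list(entry["arguments"])
        else:
            # Otherwise split the shell-style "command" string.
            argv = shlex.split(entry["command"])
        yield entry["directory"], entry["file"], argv
```

A caller would read compile_commands.json, pass the text in, and hand each argv to whatever code decides how to run the analyzer, which is the part that could be common to every front end.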

So. Given all that, maybe some of my original objections make a little more sense now. A lot of what you've done here has been nice work, but I don't see it being easy for someone without too much experience with Python to be able to walk up and change some piece of it, and have us be confident that it's not going to cause problems somewhere else. I've seen this happen at least a few times with the Perl implementation already.

(Or, to put it another way, the current implementation is all in Ted's head. This one's all in your head. So we didn't solve the problem yet.)

I wonder if part of the problem is following the Perl implementation too closely. Rather than pass around dictionaries of options, why not use an actual Invocation object or similar? Instead of using continuations, why not just use normal method calls? (I'm not convinced the auto-chaining has enough real benefit, but even if it did you could put that all into your stack() implementation. FWIW I also don't understand the name "stack".)
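
As a rough sketch of the suggestion above (not the actual scan-build design), an explicit object with ordinary method calls could look something like this. Every name here (Invocation, parse_invocation, analyzer_command, SOURCE_SUFFIXES) is hypothetical, and the flag handling is deliberately simplified; the only real interface assumed is clang's --analyze driver flag.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative list of suffixes treated as source files; not exhaustive.
SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".cxx", ".m", ".mm")


@dataclass
class Invocation:
    """Hypothetical record of one compiler invocation, replacing an options dict."""
    compiler: str
    source: str
    flags: List[str] = field(default_factory=list)

    def analyzer_command(self, clang: str = "clang") -> List[str]:
        # Reuse the original flags, but ask clang to analyze instead of compile.
        return [clang, "--analyze", *self.flags, self.source]


def parse_invocation(argv: List[str]) -> Optional[Invocation]:
    """Build an Invocation from a compiler command line, or None if there is
    nothing to analyze (for example, a link-only step)."""
    compiler, args = argv[0], argv[1:]
    flags: List[str] = []
    source = None
    skip_next = False
    for arg in args:
        if skip_next:          # this is the argument of a preceding "-o"
            skip_next = False
        elif arg == "-o":      # drop the output file; the analyzer picks its own
            skip_next = True
        elif arg == "-c":      # drop "compile only"; --analyze replaces it
            continue
        elif not arg.startswith("-") and arg.endswith(SOURCE_SUFFIXES):
            if source is None:  # simplification: only the first source is kept
                source = arg
        else:
            flags.append(arg)
    if source is None:
        return None
    return Invocation(compiler, source, flags)
```

With this shape, each step is a plain call (parse, then build the analyzer command, then run it), and a reader can see which fields exist without tracing which keys a dictionary happens to contain at a given point.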

I'm also not afraid of breaking this out into multiple files. The cost of loading additional files shouldn't matter compared to the actual time to analyze. At least, I hope not.

I'll try to answer some of your specific questions from the last few months in a second e-mail, but hopefully this gives you a better picture of our vision for scan-build's future. As such, we should be trying to make it "as simple as possible, but not simpler". :-)

Thanks again for working on this,
Jordan