[cfe-dev] scan-build in python

Tue Aug 26 07:12:54 PDT 2014

hi Jordan,

thanks for your message. most of this could have been said before, maybe on
that 'open projects' page. ;)

to address some of your concerns, i would also explain where i am and
where i'm heading with this... first, i'm learning the python language
with this task. therefore i think of myself as a newbie, and i
believe the code i'm writing is not that hard to read for other less
experienced python developers... partly, i'm interested in how to write
testable code in this new, foreign language. (for example, this
is the reason why i chose continuations instead of simple method calls or
classes with methods. i found it a good compromise between simple and testable.)
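
to make the continuation idea concrete, here is a minimal sketch. it is not the code from the repository: the function names (set_language, build_command, finish) and the keys of the options dictionary are made up for illustration. each step gets the options and a continuation, and either stops the chain or calls the continuation with an updated copy.

```python
def set_language(opts, continuation):
    """Guess the language from the file extension, then pass control on."""
    extension = opts['file'].rsplit('.', 1)[-1]
    language = {'c': 'c', 'cpp': 'c++', 'cc': 'c++'}.get(extension)
    if language is None:
        return None  # not a source file we recognise: stop the chain here
    return continuation(dict(opts, language=language))


def build_command(opts, continuation):
    """Assemble an analyzer command line, then pass control on."""
    command = ['clang', '--analyze', '-x', opts['language'], opts['file']]
    return continuation(dict(opts, command=command))


def finish(opts):
    """Last step of the chain: just hand back the command."""
    return opts['command']


# chain the steps by hand; a helper could do this wiring automatically
result = set_language({'file': 'main.c'},
                      lambda opts: build_command(opts, finish))
```

the testability argument is that any step can be exercised alone by handing it a trivial continuation (for example one that just returns its argument), without running the steps that follow it.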

about the compilation database issue: the current implementation does a lot
of things at once. i would rather separate the major parts... and i wanted
to do the rewrite in 3 steps... the 1st step (already finished)
rewrote 'ccc-analyzer' in python, so that it is able to work with the perl
version of 'scan-build'. keeping the original boundaries between the processes
let me check that it produces the exact same output... currently i'm working
on the 2nd step, where 'scan-build' is replaced by 'beye', which works from a
compilation database. this already diverges from the original in many
places. (i consider those improvements :)) depending on my other
activities, this might be finished soon... in the 3rd step, i would add a
compilation database generator to mimic the original behavior. i already
have a project (called 'bear') which does this job. with a lot of help
from other people, it has become stable, has been ported to many OSes and
is widely distributed. my plan is to integrate 'bear' (rewriting some C code in
python) with 'beye' to make a full replacement for 'scan-build'. this way
the Clang project would gain not only a rewrite but also a compilation
database generator.
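
for readers who have not seen one: a compilation database is a JSON list with one entry per compiled file, each carrying 'directory', 'command' and 'file' keys. a rough sketch of reading such a file could look like the following. the function name read_entries and the example entry are assumptions for illustration, not how 'beye' actually does it.

```python
import json
import shlex


def read_entries(database_text):
    """Yield one dictionary per compilation, with the command split into argv."""
    for entry in json.loads(database_text):
        yield {
            'directory': entry['directory'],
            'file': entry['file'],
            # the database stores the command as one shell-quoted string
            'argv': shlex.split(entry['command']),
        }


# a made-up database with a single entry, as a generator tool might record it
example = json.dumps([{
    'directory': '/tmp/project',
    'command': 'cc -c -DNDEBUG main.c -o main.o',
    'file': 'main.c',
}])

entries = list(read_entries(example))
```

a real tool would still have to decide which of the recorded flags to forward to the analyzer and which to drop (like '-c' and '-o'); that part is left out here.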

about distribution: my plan is to create a python package... i found that the
LLVM 'lit' command is also a standalone python package, integrated into the
LLVM source tree but also available on PyPI. (i'm using the PyPI package in my
travis-ci jobs, since many distro packages do not install it.)

to not make this longer, i would summarize how i am targeting the goals you
mentioned. please recommend other ways if you can... re-using the
'scan-build' parts to be able to check any project with any build system:
no. i would reuse the 'bear' parts, since those cover the build
systems better... being able to work with any build system: yes...
maintainability: i'm writing unit tests and have a small number of
functional tests (these are not yet checked in). i'm using the 'pep8' tool to
conform to python style, and i'm trying my best to write documentation...
easy to distribute: i'm using travis-ci to check that it works on many python
versions (2.7, 3.2, 3.3 and 3.4 are currently targeted). creating a PyPI
package is planned... about multiple files and/or using classes instead of
passing dictionaries to methods, i am open to those if that helps in any
way. :)
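
on the classes-versus-dictionaries point, a small sketch of what a class could look like in place of an options dictionary. the class name Invocation is borrowed from your suggestion; its fields and the command() method are assumptions for illustration only, not existing code.

```python
class Invocation(object):
    """Hypothetical value object describing one analyzer run."""

    def __init__(self, directory, source, flags=None, language=None):
        self.directory = directory
        self.source = source
        self.flags = list(flags or [])
        self.language = language

    def command(self, clang='clang'):
        """Build the argv list for analyzing this source file."""
        cmd = [clang, '--analyze']
        if self.language:
            cmd += ['-x', self.language]
        return cmd + self.flags + [self.source]


invocation = Invocation('/tmp/project', 'main.c',
                        flags=['-DNDEBUG'], language='c')
argv = invocation.command()
```

compared with a dictionary, a misspelled field fails loudly with an AttributeError instead of silently creating a new key, and the constructor documents which values a run needs.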

i did not want this to be so long, and i would not make more noise on this
mailing list about it. i wanted to come back when i'm finished... until then
i'm collecting my questions/comments on the github issue tracker. feel free to
answer anywhere. ;)

regards,
Laszlo

On Mon, Aug 25, 2014 at 8:24 PM, Jordan Rose <jordan_rose at apple.com> wrote:

> Hi, Laszlo. Sorry for going silent for, um, months; Swift has been taking
> a lot of our time. But we realized that listing the project on the "Open
> Projects" list without any real context was probably not a great idea. I'd
> like to take a step back and talk about where we see this going.
>
> scan-build has been around pretty much as long as the analyzer has; it was
> (and is) a cheap way to piggy-back on an existing build system to get the
> analyzer to run on a project without much work. It already does that, and
> it's good at that, but the current implementation has some problems.
>
> - *It's not necessarily so clean.* Ted admits that the current
> implementation may not be the cleanest code; Perl-isms aside, it has grown
> in one direction and then another over the years to implement various
> enhancements. Both scan-build and ccc-analyzer could use cleanups.
>
> - *It's not tested.* We don't have a single public test that runs
> scan-build or even ccc-analyzer. Apple has some tests internally, but we
> haven't done anything with them to make them accessible to open-source
> contributors.
>
> - *It's written in Perl*. LLVM has a lot more Python in it than Perl,
> including the Python bindings and even the scan-view tool we ship with
> scan-build. Being Perl is currently a bit of a barrier to entry to working
> on scan-build. (The other obvious choice, C++ "like the rest of LLVM", has
> the disadvantage of requiring compilation, which doesn't play well with
> extensibility.)
>
> What we'd like from a hypothetical scan-build replacement would fix these
> issues, but also give us a good base to go on for the future:
>
> - *Reusable / Extensible.* You're using Beye to handle analyzing files
> based on a compilation database rather than an existing build system.
> Wouldn't it have been nice to have been able to reuse parts of scan-build
> instead?
>
> - *Maintainable.* As you've seen, I haven't been so sure of what
> everything in the current scan-build / ccc-analyzer is for. Ted could
> probably still tell you, but he's inherently busy due to being a manager.
> It's not really a good thing if only one person knows how something works!
> That's true in too many parts of Clang already; we should endeavour to make
> that *less* true whenever possible.
>
> - *Easy to Distribute.* The current Perl code does have one advantage:
> pretty much all Unix systems have a Perl as part of their base
> installation. Several years ago the same wasn't true of Python, but I think
> that's changed. Even so, we should make sure it's still easy to ship an
> analyzer build, scan-build included, on the platforms we care about. (This
> also includes minimizing dependencies for both developers and users of the
> tool, so thanks for already keeping that in mind.)
>
> So. Given all that, maybe some of my original objections make a little
> more sense now. A lot of what you've done here has been nice work, but I
> don't see it being easy for someone without too much experience with Python
> to be able to walk up and change some piece of it, and have us be confident
> that it's not going to cause problems somewhere else. I've seen this happen
> at least a few times with the Perl implementation already.
>
> (Or, to put it another way, the current implementation is all in Ted's
> head. This one's all in *your* head. So we didn't solve the problem yet.)
>
> I wonder if part of the problem is following the Perl implementation *too* closely.
> Rather than pass around dictionaries of options, why not use an actual
> Invocation object or similar? Instead of using continuations, why not just
> use normal method calls? (I'm not convinced the auto-chaining has enough
> real benefit, but even if it did you could put that all into your stack()
> implementation. FWIW I also don't understand the name "stack".)
>
> I'm also not afraid of breaking this out into multiple files. The cost of
> loading additional files shouldn't matter compared to the actual time to
> analyze. At least, I hope not.
>
> I'll try to answer some of your specific questions from the last few
> months in a second e-mail, but hopefully this gives you a better picture of
> our vision for scan-build's future. As such, we should be trying to make it
> "as simple as possible, but not simpler". :-)
>
> Thanks again for working on this,
> Jordan
>