[cfe-dev] CopyPaste detection clang static analyzer
Vassil Vassilev
vvasilev at cern.ch
Fri Feb 7 04:49:52 PST 2014
On 05/02/14 21:32, Nick Lewycky wrote:
> On 3 February 2014 14:08, Richard <legalize at xmission.com> wrote:
>
>
> In article
> <CAENS6EsgzhXWfANFze8VAp68qDGHnrHNZJaaLmi28YJtnQwOmw at mail.gmail.com>,
> David Blaikie <dblaikie at gmail.com>
> writes:
>
> > On Mon, Feb 3, 2014 at 3:06 AM, Vassil Vassilev <vvasilev at cern.ch> wrote:
> >
> > > A few months ago I was looking for a copy-paste detector for a C++
> > > project. I didn't find such a feature of clang's static analyzer.
> > > Is this the case?
> >
> > copy-paste detector? As in plagiarism detection?
>
> I don't think plagiarism is the concern. The concern is copy/paste
> of blocks of code where the pasted block needs to be updated in
> several places, but not all of the updates were performed.
>
>
> I've implemented this sort of thing, but it's only 80% finished and
> has been kicking around on the low-priority end of my todo list for
> the past couple of years. Patch attached. It'd be great if someone
> were interested in finishing this off. I won't get to it soon.
>
> Note that it's a warning instead of a static analysis check which
> means that it must have an aggressively low number of false positives,
> and that it must be run quickly. The implementation I have analyzes
> conditional operators and if/else-if chains, but doesn't collect all
> the expressions through something like a && b && c && a. That would be
> the next thing to add.
>
> It does have some really cool properties that we can only get because
> clang integrates closely with its preprocessor. Consider this sample
> from the testcase:
>
> #define num_cpus() (1)
> #define max_omp_threads() (1)
> int test8(int expr) {
>   if (expr) {
>     return num_cpus();
>   } else {
>     return max_omp_threads();
>   }
> }
>
> We know better than to warn on that, even though the AST looks the
> same. If you instead write "return num_cpus();" twice, we warn on that
> (that's test9 in the testsuite).
>
> Nick
Thanks, this looks very interesting. This may be a good starting point
for a student. IIUC, non-unique exprs are the ones that have the same
source ranges and the same FileIDs, right? Could this be upgraded to
AST-node (structural) comparison?
Vassil
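
To make the question concrete, here is a minimal sketch of the two notions
of "same expression": identity by location (same file, same source range)
versus structural equality of the nodes. It is not taken from Nick's patch
and does not use clang's classes. Expr, sameLocation, sameStructure and
literalFromMacro are hypothetical names, and fileId/begin/end are only
stand-ins for a FileID and a source range. Recording the macro name on each
node is an assumption about how one could keep the num_cpus() versus
max_omp_threads() case from comparing equal.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical toy expression node; NOT a clang class.
struct Expr {
    std::string kind;        // e.g. "IntegerLiteral", "BinaryOperator"
    std::string value;       // literal text, identifier, or operator spelling
    std::string macro;       // macro the node was expanded from, "" if none
    int fileId = 0;          // stand-in for a FileID
    int begin = 0, end = 0;  // stand-in for a source range (offsets)
    std::vector<std::shared_ptr<Expr>> children;
};

using ExprPtr = std::shared_ptr<Expr>;

// Location identity: same file and same source range.
bool sameLocation(const Expr &a, const Expr &b) {
    return a.fileId == b.fileId && a.begin == b.begin && a.end == b.end;
}

// Structural equality: same kind, value, macro origin and children,
// regardless of where the nodes were written.
bool sameStructure(const Expr &a, const Expr &b) {
    if (a.kind != b.kind || a.value != b.value || a.macro != b.macro)
        return false;
    if (a.children.size() != b.children.size())
        return false;
    for (std::size_t i = 0; i < a.children.size(); ++i)
        if (!sameStructure(*a.children[i], *b.children[i]))
            return false;
    return true;
}

// Builds an integer literal that came out of a macro expansion.
ExprPtr literalFromMacro(const std::string &value, const std::string &macro,
                         int begin, int end) {
    auto e = std::make_shared<Expr>();
    e->kind = "IntegerLiteral";
    e->value = value;
    e->macro = macro;
    e->begin = begin;
    e->end = end;
    return e;
}
```

With this model, two "return num_cpus();" branches written at different
offsets compare equal under sameStructure but not under sameLocation, while
a num_cpus() branch and a max_omp_threads() branch differ structurally
because their macro origins differ, even though both expand to (1).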
>
> Coverity can detect such instances, for instance.
>
> Here is an article from 2006 describing such a tool:
> <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.123.113>
>
> Wikipedia says PMD has a copy/paste detector that works with C++:
> <http://en.wikipedia.org/wiki/PMD_(software)#Copy.2FPaste_Detector_.28CPD.29>
>
> "Note that CPD works with Java, JSP, C, C++, C#, Fortran and PHP code.
> Your own language is missing ? See how to add it here"
> <http://pmd.sourceforge.net/snapshot/cpd-usage.html>
> --
> "The Direct3D Graphics Pipeline" free book
> <http://tinyurl.com/d3d-pipeline>
> The Computer Graphics Museum <http://ComputerGraphicsMuseum.org>
> The Terminals Wiki <http://terminals.classiccmp.org>
> Legalize Adulthood! (my blog)
> <http://LegalizeAdulthood.wordpress.com>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev