[cfe-dev] CopyPaste detection clang static analyzer

Nick Lewycky nlewycky at google.com
Wed Feb 5 12:32:07 PST 2014


On 3 February 2014 14:08, Richard <legalize at xmission.com> wrote:

>
> In article <
> CAENS6EsgzhXWfANFze8VAp68qDGHnrHNZJaaLmi28YJtnQwOmw at mail.gmail.com>,
>     David Blaikie <dblaikie at gmail.com> writes:
>
> > On Mon, Feb 3, 2014 at 3:06 AM, Vassil Vassilev <vvasilev at cern.ch> wrote:
> >
> > >   A few months ago I was looking for a copy-paste detector for a C++
> > > project. I didn't find such a feature of clang's static analyzer. Is
> > > this the case?
> >
> > copy-paste detector? As in plagiarism detection?
>
> I don't think plagiarism is the concern.  The concern is copy/paste of
> blocks of code where the pasted block needs to be updated in several
> places, but not all of the updates were performed.
>

I've implemented this sort of thing, but it's only 80% finished and has
been kicking around on the low-priority end of my todo list for the past
couple of years. Patch attached. It'd be great if someone were interested
in finishing this off. I won't get to it soon.

Note that it's a warning instead of a static analysis check, which means
that it must have an aggressively low number of false positives and that
it must run quickly. The implementation I have analyzes conditional
operators and if/else-if chains, but doesn't collect all the expressions
through something like a && b && c && a. That would be the next thing to add.
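
To make the three shapes mentioned above concrete, here is a minimal
sketch of the kind of code involved. The function names are hypothetical
and are not taken from the patch; which of these the patch actually
diagnoses, and with what wording, is not shown here.

```c
/* Illustrative only: these functions are hypothetical, not from the patch. */

/* Conditional operator whose two arms are identical. */
int pick(int flag, int x, int y) {
  (void)y;             /* y was probably meant to be the second arm */
  return flag ? x : x;
}

/* if/else-if chain that tests the same condition twice. */
int classify(int n) {
  if (n < 0) {
    return -1;
  } else if (n < 0) {  /* repeated condition: this branch can never run */
    return 0;
  }
  return 1;
}

/* Operand repeated inside an && chain: the shape described above as not
   yet collected by the implementation. */
int all_set(int a, int b, int c) {
  return a && b && c && a;
}
```

All three compile and run; the point is that each one almost certainly
contains a leftover from copy/paste rather than intended logic.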

It does have some really cool properties that we can only get because clang
integrates closely with its preprocessor. Consider this sample from the
testcase:

#define num_cpus() (1)
#define max_omp_threads() (1)
int test8(int expr) {
  if (expr) {
    return num_cpus();
  } else {
    return max_omp_threads();
  }
}

We know better than to warn on that: the two returns are spelled with
different macros, even though the AST after expansion looks the same. If
you instead write "return num_cpus();" in both branches, we warn on that
(that's test9 in the testsuite).
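
For comparison, the warned-on variant would look roughly like the sketch
below. This is reconstructed from the description above, not copied from
the patch; the name test9_sketch is made up, and the real test9 may
differ in detail.

```c
#define num_cpus() (1)
#define max_omp_threads() (1)

/* Reconstruction of the case described above: both branches use the
   same macro, so the duplication is visible in the source spelling and
   not only in the expanded AST. */
int test9_sketch(int expr) {
  if (expr) {
    return num_cpus();
  } else {
    return num_cpus();
  }
}
```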

Nick

> Coverity can detect such instances, for instance.
>
> Here is an article from 2006 describing such a tool:
> <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.123.113>
>
> Wikipedia says PMD has a copy/paste detector that works with C++:
> <http://en.wikipedia.org/wiki/PMD_(software)#Copy.2FPaste_Detector_.28CPD.29>
>
> "Note that CPD works with Java, JSP, C, C++, C#, Fortran and PHP code.
> Your own language is missing ? See how to add it here"
> <http://pmd.sourceforge.net/snapshot/cpd-usage.html>
> --
> "The Direct3D Graphics Pipeline" free book <
> http://tinyurl.com/d3d-pipeline>
>      The Computer Graphics Museum <http://ComputerGraphicsMuseum.org>
>          The Terminals Wiki <http://terminals.classiccmp.org>
>   Legalize Adulthood! (my blog) <http://LegalizeAdulthood.wordpress.com>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang-useless-condition-1.patch
Type: application/octet-stream
Size: 51641 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140205/72b568a1/attachment.obj>

