[cfe-dev] CopyPaste detection clang static analyzer

Vassil Vassilev vvasilev at cern.ch
Fri Feb 7 04:49:52 PST 2014


On 05/02/14 21:32, Nick Lewycky wrote:
> On 3 February 2014 14:08, Richard <legalize at xmission.com> wrote:
>
>
>     In article
>     <CAENS6EsgzhXWfANFze8VAp68qDGHnrHNZJaaLmi28YJtnQwOmw at mail.gmail.com>,
>         David Blaikie <dblaikie at gmail.com> writes:
>
>     > On Mon, Feb 3, 2014 at 3:06 AM, Vassil Vassilev <vvasilev at cern.ch> wrote:
>     >
>     > >   A few months ago I was looking for a copy-paste detector for a C++
>     > > project. I didn't find such a feature in clang's static analyzer. Is this
>     > > the case?
>     >
>     > copy-paste detector? As in plagiarism detection?
>
>     I don't think plagiarism is the concern.  The concern is
>     copy/paste of blocks of code where the pasted block needs to be
>     updated in several places, but not all of the updates were performed.
>
>
> I've implemented this sort of thing, but it's only 80% finished and 
> has been kicking around on the low-priority end of my todo list for 
> the past couple of years. Patch attached. It'd be great if someone 
> were interested in finishing this off. I won't get to it soon.
>
> Note that it's a warning rather than a static analyzer check, which
> means that it must have an aggressively low number of false positives
> and that it must run quickly. The implementation I have analyzes
> conditional operators and if/else-if chains, but doesn't collect all
> the expressions through something like a && b && c && a. That would be
> the next thing to add.
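For illustration, collecting the operands of an && chain could look roughly like the sketch below. It works on a toy expression type, not on clang's AST; Expr, var, land, collectOperands and repeatedOperands are all hypothetical names and are not taken from the attached patch.

```cpp
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy expression node (hypothetical; this is not clang's Expr).
struct Expr {
  enum Kind { Var, And } kind;
  std::string name;                // used when kind == Var
  std::shared_ptr<Expr> lhs, rhs;  // used when kind == And
};
using ExprPtr = std::shared_ptr<Expr>;

// Helpers to build the toy tree.
ExprPtr var(const std::string &n) {
  return std::make_shared<Expr>(Expr{Expr::Var, n, nullptr, nullptr});
}
ExprPtr land(ExprPtr l, ExprPtr r) {
  return std::make_shared<Expr>(
      Expr{Expr::And, "", std::move(l), std::move(r)});
}

// Collect the leaf operands of a (possibly nested) && chain, left to right.
void collectOperands(const ExprPtr &e, std::vector<std::string> &out) {
  if (e->kind == Expr::And) {
    collectOperands(e->lhs, out);
    collectOperands(e->rhs, out);
  } else {
    out.push_back(e->name);
  }
}

// Return every operand occurrence that repeats an earlier operand.
std::vector<std::string> repeatedOperands(const ExprPtr &e) {
  std::vector<std::string> ops, dups;
  collectOperands(e, ops);
  std::set<std::string> seen;
  for (const auto &op : ops)
    if (!seen.insert(op).second)  // insert fails if op was already seen
      dups.push_back(op);
  return dups;
}
```

For a && b && c && a, built as land(land(land(var("a"), var("b")), var("c")), var("a")), repeatedOperands reports the second a. A real check would compare whole subexpressions rather than variable names.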
>
> It does have some really cool properties that we can only get because 
> clang integrates closely with its preprocessor. Consider this sample 
> from the testcase:
>
> #define num_cpus() (1)
> #define max_omp_threads() (1)
> int test8(int expr) {
>   if (expr) {
>     return num_cpus();
>   } else {
>     return max_omp_threads();
>   }
> }
>
> We know better than to warn on that, even though the AST looks the 
> same. If you instead write "return num_cpus();" twice, we warn on that 
> (that's test9 in the testsuite).
>
> Nick
Thanks, this looks very interesting. This may be a good starting point 
for a student. IIUC, non-unique exprs are the ones that have the same 
source ranges and the same FileIDs, right? Could this be upgraded to 
AST-node (structural) comparison?
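To make the question concrete, here is a minimal sketch of what I mean by structural comparison, on a toy node type rather than clang's AST. Node, sameStructure and sameStructureAndSpelling are hypothetical names, and the spelling field is only my stand-in for whatever source information the patch actually consults.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy AST node (hypothetical; this is not clang's Stmt or Expr).
struct Node {
  std::string kind;      // e.g. "return", "int-literal"
  std::string value;     // value after macro expansion, e.g. "1"
  std::string spelling;  // text as written, e.g. "num_cpus()"
  std::vector<std::shared_ptr<Node>> children;
};

// Purely structural equality: same kinds, values and child shapes.
// Ignores how the code was spelled in the source.
bool sameStructure(const Node &a, const Node &b) {
  if (a.kind != b.kind || a.value != b.value ||
      a.children.size() != b.children.size())
    return false;
  for (std::size_t i = 0; i < a.children.size(); ++i)
    if (!sameStructure(*a.children[i], *b.children[i]))
      return false;
  return true;
}

// Stricter variant: also require identical spelling at every node, so
// two different macros that expand to the same tokens do not match.
bool sameStructureAndSpelling(const Node &a, const Node &b) {
  if (a.kind != b.kind || a.value != b.value ||
      a.spelling != b.spelling ||
      a.children.size() != b.children.size())
    return false;
  for (std::size_t i = 0; i < a.children.size(); ++i)
    if (!sameStructureAndSpelling(*a.children[i], *b.children[i]))
      return false;
  return true;
}
```

With leaves for num_cpus() and max_omp_threads() that both expand to 1, sameStructure matches them while sameStructureAndSpelling does not, which is the test8 situation. A purely structural check would therefore still need the preprocessor information to avoid that false positive.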
Vassil
>
>     Coverity can detect such instances, for instance.
>
>     Here is an article from 2006 describing such a tool:
>     <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.123.113>
>
>     Wikipedia says PMD has a copy/paste detector that works with C++:
>     <http://en.wikipedia.org/wiki/PMD_(software)#Copy.2FPaste_Detector_.28CPD.29>
>
>     "Note that CPD works with Java, JSP, C, C++, C#, Fortran and PHP code.
>     Your own language is missing ? See how to add it here"
>     <http://pmd.sourceforge.net/snapshot/cpd-usage.html>
>     --
>     "The Direct3D Graphics Pipeline" free book
>     <http://tinyurl.com/d3d-pipeline>
>          The Computer Graphics Museum <http://ComputerGraphicsMuseum.org>
>              The Terminals Wiki <http://terminals.classiccmp.org>
>       Legalize Adulthood! (my blog)
>     <http://LegalizeAdulthood.wordpress.com>
>     _______________________________________________
>     cfe-dev mailing list
>     cfe-dev at cs.uiuc.edu
>     http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
