[cfe-dev] CopyPaste detection clang static analyzer

Nick Lewycky nlewycky at google.com
Fri Feb 7 13:20:26 PST 2014


On 7 February 2014 04:49, Vassil Vassilev <vvasilev at cern.ch> wrote:

>  On 05/02/14 21:32, Nick Lewycky wrote:
>
>  On 3 February 2014 14:08, Richard <legalize at xmission.com> wrote:
>
>>
>> In article <CAENS6EsgzhXWfANFze8VAp68qDGHnrHNZJaaLmi28YJtnQwOmw at mail.gmail.com>,
>>     David Blaikie <dblaikie at gmail.com> writes:
>>
>> > On Mon, Feb 3, 2014 at 3:06 AM, Vassil Vassilev <vvasilev at cern.ch> wrote:
>> >
>> > >   A few months ago I was looking for a copy-paste detector for a C++
>> > > project. I didn't find such a feature of clang's static analyzer. Is
>> > > this the case?
>> >
>> > copy-paste detector? As in plagiarism detection?
>>
>> I don't think plagiarism is the concern.  The concern is copy/paste
>> of blocks of code where the pasted block needs to be updated in
>> several places, but not all of the updates were performed.
>>
>
>  I've implemented this sort of thing, but it's only 80% finished and has
> been kicking around on the low-priority end of my todo list for the past
> couple of years. Patch attached. It'd be great if someone were interested
> in finishing this off. I won't get to it soon.
>
>  Note that it's a warning instead of a static analysis check, which means
> that it must have an aggressively low number of false positives and that
> it must run quickly. The implementation I have analyzes conditional
> operators and if/else-if chains, but doesn't collect all the expressions
> through something like a && b && c && a. That would be the next thing to add.
>
>  It does have some really cool properties that we can only get because
> clang integrates closely with its preprocessor. Consider this sample from
> the testcase:
>
> #define num_cpus() (1)
> #define max_omp_threads() (1)
> int test8(int expr) {
>   if (expr) {
>     return num_cpus();
>   } else {
>     return max_omp_threads();
>   }
> }
>
>  We know better than to warn on that, even though the AST looks the same.
> If you instead write "return num_cpus();" twice, we warn on that (that's
> test9 in the testsuite).
>
>  Nick
>
> Thanks, this looks very interesting. This may be a good start for a
> student. IIUC, non-unique exprs are the ones that have the same source
> ranges and the same FileIDs, right? Could this be upgraded to AST-node
> (structural) comparison?
>

It is an AST-node comparison. In order to handle the case of different
macros, we ask the AST nodes what their SourceLocation was, and factor in
the macro ID, if there was one. A large part of the patch is a change to the
Stmt::Profile logic to look at all the SourceLocations in all the possible
AST nodes.
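
To make that concrete, here is a toy sketch of the idea. It is not code from
the patch: Node, profile, looksCopyPasted, returnOfMacro and selfCheck are
made-up names for illustration, and the real change works on clang's Stmt
profiling and SourceLocations rather than on strings. Each node records which
macro it was expanded from (if any), and that goes into the profile alongside
the structure:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an AST node. This is NOT clang's Stmt class.
struct Node {
  std::string kind;   // e.g. "ReturnStmt", "IntegerLiteral"
  std::string value;  // literal text, empty if not applicable
  std::string macro;  // macro this node was expanded from, empty if none
  std::vector<Node> children;
};

// A profile is a flat sequence describing a subtree. Two subtrees are
// treated as duplicates when their profiles are equal.
using Profile = std::vector<std::string>;

void profile(const Node &n, bool useMacroOrigin, Profile &out) {
  out.push_back(n.kind);
  out.push_back(n.value);
  if (useMacroOrigin)
    out.push_back(n.macro);  // the extra ingredient: where the node came from
  out.push_back(std::to_string(n.children.size()));
  for (const Node &child : n.children)
    profile(child, useMacroOrigin, out);
}

bool looksCopyPasted(const Node &a, const Node &b, bool useMacroOrigin = true) {
  Profile pa, pb;
  profile(a, useMacroOrigin, pa);
  profile(b, useMacroOrigin, pb);
  return pa == pb;
}

// Builds the tree for "return M();" where "#define M() (1)".
Node returnOfMacro(const std::string &macroName) {
  Node literal{"IntegerLiteral", "1", macroName, {}};
  Node paren{"ParenExpr", "", macroName, {literal}};
  return Node{"ReturnStmt", "", "", {paren}};
}

// Usage mirroring the two cases described in this thread.
void selfCheck() {
  // test8-style: same shape, different macros, so no warning.
  assert(!looksCopyPasted(returnOfMacro("num_cpus"),
                          returnOfMacro("max_omp_threads")));
  // test9-style: the same macro in both branches, so warn.
  assert(looksCopyPasted(returnOfMacro("num_cpus"),
                         returnOfMacro("num_cpus")));
}
```

With the macro origin left out of the profile (useMacroOrigin = false), the
test8-style branches compare as identical, which is exactly the false
positive we want to avoid.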


>
> Vassil
>
>
>> Coverity, for instance, can detect such cases.
>>
>> Here is an article from 2006 describing such a tool:
>> <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.123.113>
>>
>> Wikipedia says PMD has a copy/paste detector that works with C++:
>> <http://en.wikipedia.org/wiki/PMD_(software)#Copy.2FPaste_Detector_.28CPD.29>
>>
>> "Note that CPD works with Java, JSP, C, C++, C#, Fortran and PHP code.
>> Your own language is missing ? See how to add it here"
>> <http://pmd.sourceforge.net/snapshot/cpd-usage.html>
>> --
>> "The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
>>      The Computer Graphics Museum <http://ComputerGraphicsMuseum.org>
>>          The Terminals Wiki <http://terminals.classiccmp.org>
>>   Legalize Adulthood! (my blog) <http://LegalizeAdulthood.wordpress.com>
>>  _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>