<html><head><meta http-equiv="Content-Type" content="text/html charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Feb 18, 2015, at 2:50 AM, Vassil Vassilev <<a href="mailto:vvasilev@cern.ch" class="">vvasilev@cern.ch</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix">That's great! What would be the next
steps? Do you know who will be the GSoC org admin? </div></div></div></blockquote><div><br class=""></div>There was an email sent about GSoC a couple of days ago to the LLVMDev list.</div><div><br class=""><blockquote type="cite" class=""><div class=""><div text="#000000" bgcolor="#FFFFFF" class=""><div class="moz-cite-prefix">Do you think we
should improve the project description</div></div></div></blockquote><div><br class=""></div><div>I think adding specific examples that we want to handle would be useful in scoping this down.</div><br class=""><blockquote type="cite" class=""><div class=""><div text="#000000" bgcolor="#FFFFFF" class=""><div class="moz-cite-prefix"> and nominate a backup
mentor?<br class="">
Vassil<br class="">
On 17/02/15 20:05, Anna Zaks wrote:<br class="">
</div>
<blockquote cite="mid:DEA2B2DD-85C9-4BE2-A37C-775EC94FCD7C@apple.com" type="cite" class="">
<div class="">This would be a very useful feature to have in the
clang static analyzer and can be scoped for a GSoC project!</div>
<div class=""><br class="">
</div>
<div class="">Anna.</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">
<blockquote type="cite" class="">
<div class="">On Feb 10, 2015, at 4:06 AM, Vassil Vassilev
<<a moz-do-not-send="true" href="mailto:vvasilev@cern.ch" class="">vvasilev@cern.ch</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix">Hi all,<br class="">
I just wanted to bump this up (given GSoC is
starting). I didn't manage to get a good student for
this project (proposal is below) last year :(. I
thought it might be better to go through the LLVM mentoring
organization. Do you think this would
make a good GSoC project from Clang's perspective? I'd
be happy to update the proposal to make it more
attractive or general-purpose.<br class="">
Vassil<br class="">
<br class="">
<h3 class="">Code copy/paste detection</h3>
<div class=""><strong class="">Description</strong>: Copy/paste
is a common programming practice. Most
programmers start from a code snippet that
already exists in the system and modify it to match
their needs. Some snippets easily end up
being copied dozens of times, which hurts
maintainability, understandability and logical
design. <a moz-do-not-send="true" class="ext" href="http://clang.llvm.org/">Clang</a> and <a moz-do-not-send="true" class="ext" href="http://clang-analyzer.llvm.org/">clang's
static analyzer</a>
provide all the building blocks to build a generic
C/C++ copy/paste detector.</div>
<div class=""><strong class="">Expected results</strong>: Build
a standalone tool or clang plugin able to
detect copy/pasted code.</div></div></div></div></blockquote></div></div></blockquote></div></div></blockquote><div><br class=""></div><div>I think having this integrated into one of the existing clang tools should be the goal. For example, the static analyzer is a good fit. The static analyzer does not have plugins.</div><br class=""><blockquote type="cite" class=""><div class=""><div text="#000000" bgcolor="#FFFFFF" class=""><blockquote cite="mid:DEA2B2DD-85C9-4BE2-A37C-775EC94FCD7C@apple.com" type="cite" class=""><div class=""><div class=""><blockquote type="cite" class=""><div class=""><div text="#000000" bgcolor="#FFFFFF" class=""><div class="moz-cite-prefix"><div class=""> Lay the foundations of
detection of slightly modified code (semantic
analysis required). Implement tests for all the
implemented functionality. Prepare a final poster of
the work and be ready to present it.</div>
<div class=""><strong class="">Required knowledge</strong>:
Advanced C++, Basic knowledge of Clang/Clang Static
Analyzer.</div><p class=""><strong class="">Mentor</strong>: Vassil
Vassilev / maybe somebody else as second mentor?<br class="">
</p>
<br class="">
On 07/02/14 22:20, Nick Lewycky wrote:<br class="">
</div>
<blockquote cite="mid:CADbEz-hdxzO6VFrRPewungnLxAPKZ7po1C07r5STaeV8z_+qpg@mail.gmail.com" type="cite" class="">
<div dir="ltr" class="">
<div class="gmail_extra">
<div class="gmail_quote">On 7 February 2014 04:49,
Vassil Vassilev <span dir="ltr" class=""><<a moz-do-not-send="true" href="mailto:vvasilev@cern.ch" target="_blank" class="">vvasilev@cern.ch</a>></span>
wrote:<br class="">
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000" class="">
<div class="im">
<div class="">On 05/02/14 21:32, Nick
Lewycky wrote:<br class="">
</div>
<blockquote type="cite" class="">
<div dir="ltr" class="">
<div class="gmail_extra">
<div class="gmail_quote">On 3
February 2014 14:08, Richard <span dir="ltr" class=""><<a moz-do-not-send="true" href="mailto:legalize@xmission.com" target="_blank" class="">legalize@xmission.com</a>></span>
wrote:<br class="">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br class="">
In article <<a moz-do-not-send="true" href="mailto:CAENS6EsgzhXWfANFze8VAp68qDGHnrHNZJaaLmi28YJtnQwOmw@mail.gmail.com" target="_blank" class="">CAENS6EsgzhXWfANFze8VAp68qDGHnrHNZJaaLmi28YJtnQwOmw@mail.gmail.com</a>>,<br class="">
<div class=""> David Blaikie
<<a moz-do-not-send="true" href="mailto:dblaikie@gmail.com" target="_blank" class="">dblaikie@gmail.com</a>>
writes:<br class="">
<br class="">
> On Mon, Feb 3, 2014 at
3:06 AM, Vassil Vassilev <<a moz-do-not-send="true" href="mailto:vvasilev@cern.ch" target="_blank" class="">vvasilev@cern.ch</a>>
wrote:<br class="">
><br class="">
</div>
<div class="">> > A few
months ago I was looking for a
copy-paste detector for a C++<br class="">
> > project. I didn't
find such a feature of clang's
static analyzer. Is this<br class="">
> > the case?<br class="">
><br class="">
> copy-paste detector? As
in plagiarism detection?<br class="">
<br class="">
</div>
I don't think plagiarism is the
concern. The concern is<br class="">
copy/paste of blocks of code
where the pasted block needs to
be<br class="">
updated in several places, but
not all of the updates were
performed.<br class="">
</blockquote>
<div class=""><br class="">
</div>
<div class="">I've implemented
this sort of thing, but it's
only 80% finished and has been
kicking around on the
low-priority end of my todo list
for the past couple of years.
Patch attached. It'd be great if
someone were interested in
finishing this off. I won't get
to it soon.</div>
<div class=""><br class="">
</div>
<div class="">Note that it's a
warning instead of a static
analysis check, which means that
it must have an aggressively low
number of false positives, and
that it must run quickly. The
implementation I have analyzes
conditional operators and
if/else-if chains, but doesn't
collect all the expressions
through something like a
&& b && c
&& a. That would be the
next thing to add.</div>
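<div class="">A minimal sketch of that "next thing to add", assuming a toy expression tree rather than Clang's AST: flatten a chain of logical-and operators into its operands and report any operand that repeats. The names <code>Expr</code>, <code>leaf</code>, <code>makeAnd</code>, <code>collectOperands</code> and <code>duplicateOperands</code> are hypothetical and are not taken from the patch or from Clang.</div>

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy expression node; this is NOT Clang's Expr class.
struct Expr {
    std::string name;                // set for a leaf operand such as "a"
    std::unique_ptr<Expr> lhs, rhs;  // both set for a logical-and node
    bool isAnd() const { return lhs && rhs; }
};

// Build a leaf operand.
std::unique_ptr<Expr> leaf(std::string n) {
    auto e = std::make_unique<Expr>();
    e->name = std::move(n);
    return e;
}

// Build a logical-and node from two subexpressions.
std::unique_ptr<Expr> makeAnd(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
    auto e = std::make_unique<Expr>();
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Flatten a nested && chain into its operands, in source order.
void collectOperands(const Expr &e, std::vector<const Expr *> &out) {
    if (e.isAnd()) {
        collectOperands(*e.lhs, out);
        collectOperands(*e.rhs, out);
    } else {
        out.push_back(&e);
    }
}

// Return the names of operands that occur more than once in the chain.
std::vector<std::string> duplicateOperands(const Expr &root) {
    std::vector<const Expr *> operands;
    collectOperands(root, operands);
    std::set<std::string> seen;
    std::vector<std::string> duplicates;
    for (const Expr *op : operands)
        if (!seen.insert(op->name).second)
            duplicates.push_back(op->name);
    return duplicates;
}
```

<div class="">In this sketch operands are compared by name only. A real check would presumably compare AST profiles instead, so that structurally identical subexpressions are matched and macro origins can be taken into account.</div>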
<div class=""><br class="">
</div>
<div class="">It does have some
really cool properties that we
can only get because clang
integrates closely with its
preprocessor. Consider this
sample from the testcase:</div>
<div class=""><br class="">
#define num_cpus() (1)<br class="">
#define max_omp_threads() (1)<br class="">
int test8(int expr) {<br class="">
if (expr) {<br class="">
return num_cpus();<br class="">
} else {<br class="">
return max_omp_threads();<br class="">
}<br class="">
}</div>
<div class=""><br class="">
</div>
<div class="">We know better than
to warn on that, even though the
AST looks the same. If you
instead write "return
num_cpus();" twice, we warn on
that (that's test9 in the
testsuite).</div>
<div class=""><br class="">
</div>
<div class="">Nick</div>
</div>
</div>
</div>
</blockquote>
</div>
Thanks, this looks very interesting. This may
be a good start for a student. IIUC, non-unique
exprs are the ones that have the same
source ranges and the same FileIDs, right? Could
this be upgraded to AST-node (structural)
comparison?</div>
</blockquote>
<div class=""><br class="">
</div>
<div class="">It is an AST-node comparison. In
order to handle the case of different macros,
we ask the AST nodes what their SourceLocation
was, and factor in the macro ID, if there was
one. A large part of the patch is a change to
the Stmt::Profile logic to look at all the
SourceLocations in all the possible AST nodes.</div>
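<div class="">The idea of factoring the macro into the comparison can be sketched roughly as follows. This is a simplified model under stated assumptions, not the patch itself: <code>Node</code>, <code>Profile</code>, <code>profile</code> and <code>isDuplicate</code> are hypothetical names, and a plain string stands in for both the structural profile and the macro identity that Clang would derive from a SourceLocation.</div>

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical stand-in for an AST node; Clang's real nodes and
// SourceLocation carry far more information than this.
struct Node {
    std::string structure;  // shape of the expanded AST, e.g. "return (1);"
    std::string macro;      // macro the tokens were expanded from, "" if none
};

// Hypothetical profile: the structural part plus the macro identity.
struct Profile {
    std::string structure;
    std::string macro;
    bool operator==(const Profile &other) const {
        return std::tie(structure, macro) ==
               std::tie(other.structure, other.macro);
    }
};

// Compute the profile of a node (assumption: both fields are enough
// to model the comparison described in the message).
Profile profile(const Node &n) { return Profile{n.structure, n.macro}; }

// Two branches count as duplicates only if both parts of the profile match.
bool isDuplicate(const Node &a, const Node &b) {
    return profile(a) == profile(b);
}
```

<div class="">Under this model, the two branches of test8 expand to the same structure but come from different macros (num_cpus and max_omp_threads), so they are not reported. Two branches that both come from num_cpus would be.</div>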
<div class=""> </div>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000" class=""><span class="HOEnZb"><font class="" color="#888888"><br class="">
Vassil</font></span>
<div class="im"><br class="">
<blockquote type="cite" class="">
<div dir="ltr" class="">
<div class="gmail_extra">
<div class="gmail_quote">
<div class=""><br class="">
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Coverity
can detect such instances, for
instance.<br class="">
<br class="">
Here is an article from 2006
describing such a tool:<br class="">
<<a moz-do-not-send="true" href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.123.113" target="_blank" class="">http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.123.113</a>><br class="">
<br class="">
Wikipedia says PMD has a
copy/paste detector that works
with C++:<br class="">
<<a moz-do-not-send="true" href="http://en.wikipedia.org/wiki/PMD_%28software%29#Copy.2FPaste_Detector_.28CPD.29" target="_blank" class="">http://en.wikipedia.org/wiki/PMD_(software)#Copy.2FPaste_Detector_.28CPD.29</a>><br class="">
<br class="">
"Note that CPD works with Java,
JSP, C, C++, C#, Fortran and PHP
code.<br class="">
Your own language is missing ?
See how to add it here"<br class="">
<<a moz-do-not-send="true" href="http://pmd.sourceforge.net/snapshot/cpd-usage.html" target="_blank" class="">http://pmd.sourceforge.net/snapshot/cpd-usage.html</a>><br class="">
<span class=""><font class="" color="#888888">--<br class="">
"The Direct3D Graphics
Pipeline" free book <<a moz-do-not-send="true" href="http://tinyurl.com/d3d-pipeline" target="_blank" class="">http://tinyurl.com/d3d-pipeline</a>><br class="">
The Computer Graphics
Museum <<a moz-do-not-send="true" href="http://computergraphicsmuseum.org/" target="_blank" class="">http://ComputerGraphicsMuseum.org</a>><br class="">
The Terminals Wiki
<<a moz-do-not-send="true" href="http://terminals.classiccmp.org/" target="_blank" class="">http://terminals.classiccmp.org</a>><br class="">
Legalize Adulthood! (my
blog) <<a moz-do-not-send="true" href="http://legalizeadulthood.wordpress.com/" target="_blank" class="">http://LegalizeAdulthood.wordpress.com</a>><br class="">
</font></span>
<div class="">
<div class="">_______________________________________________<br class="">
cfe-dev mailing list<br class="">
<a moz-do-not-send="true" href="mailto:cfe-dev@cs.uiuc.edu" target="_blank" class="">cfe-dev@cs.uiuc.edu</a><br class="">
<a moz-do-not-send="true" href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" target="_blank" class="">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
<br class="">
</blockquote>
<br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
</blockquote>
<br class="">
<br class="">
<pre class="moz-signature" cols="72">--
--------------------------------------------
Q: Why is this email five sentences or less?
A: <a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://five.sentenc.es/">http://five.sentenc.es</a>
</pre>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div></blockquote></div><br class=""></body></html>