<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 24/02/15 06:15, Anna Zaks wrote:<br>
</div>
<blockquote
cite="mid:95677D8A-FCEF-4A75-AA28-52D01DCF35C6@apple.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<br class="">
<div>
<blockquote type="cite" class="">
<div class="">On Feb 18, 2015, at 2:50 AM, Vassil Vassilev
<<a moz-do-not-send="true" href="mailto:vvasilev@cern.ch"
class="">vvasilev@cern.ch</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix">That's great! What would be
the next steps? Do you know who will be the GSoC org
admin? </div>
</div>
</div>
</blockquote>
<div><br class="">
</div>
There was an email sent about GSoC a couple of days ago to the
LLVMDev list.</div>
</blockquote>
Thanks for the information. I addressed all of your comments and
sent a patch to OpenProjects.html, also cc-ing you, Anna, for
review.<br>
Many thanks,<br>
Vassil<br>
<blockquote
cite="mid:95677D8A-FCEF-4A75-AA28-52D01DCF35C6@apple.com"
type="cite">
<div><br class="">
<blockquote type="cite" class="">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix">Do you think we should
improve the project description</div>
</div>
</div>
</blockquote>
<div><br class="">
</div>
<div>I think adding specific examples that we want to handle
would be useful in scoping this down.</div>
</div>
</blockquote>
<blockquote
cite="mid:95677D8A-FCEF-4A75-AA28-52D01DCF35C6@apple.com"
type="cite">
<div><br class="">
<blockquote type="cite" class="">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix"> and nominate a backup
mentor?<br class="">
Vassil<br class="">
On 17/02/15 20:05, Anna Zaks wrote:<br class="">
</div>
<blockquote
cite="mid:DEA2B2DD-85C9-4BE2-A37C-775EC94FCD7C@apple.com"
type="cite" class="">
<div class="">This would be a very useful feature to
have in the clang static analyzer and can be scoped
for a GSoC project!</div>
<div class=""><br class="">
</div>
<div class="">Anna.</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">
<blockquote type="cite" class="">
<div class="">On Feb 10, 2015, at 4:06 AM, Vassil
Vassilev <<a moz-do-not-send="true"
href="mailto:vvasilev@cern.ch" class="">vvasilev@cern.ch</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix">Hi all,<br
class="">
I just wanted to bump this up (given GSoC
is starting). I didn't manage to get a good
student for this project (proposal is below)
last year :(. I thought it might be better to go
through the LLVM mentoring organization
instead. Do you think this would
make a good GSoC project from Clang's
perspective? I'd be happy to update the
proposal to make it more attractive or
general-purpose.<br class="">
Vassil<br class="">
<br class="">
<h3 class="">Code copy/paste detection</h3>
<div class=""><strong class="">Description</strong>:
Copy/paste is a common programming practice.
Most programmers start from a code
snippet that already exists in the system
and modify it to match their needs. Some
of these snippets easily end up being
copied dozens of times, which hurts
maintainability, understandability
and logical design. <a
moz-do-not-send="true" class="ext"
href="http://clang.llvm.org/">Clang</a> and <a
moz-do-not-send="true" class="ext"
href="http://clang-analyzer.llvm.org/">clang's
static analyzer</a> provide
all the building blocks to build a generic
C/C++ copy/paste detector.</div>
<div class=""><strong class="">Expected
results</strong>: Build a standalone tool
or clang plugin that can detect
copy/pasted code.</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
<div><br class="">
</div>
<div>I think having this integrated into one of the existing
clang tools should be the goal. For example, the static
analyzer is a good fit. The static analyzer does not have
plugins.</div>
<br class="">
<blockquote type="cite" class="">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<blockquote
cite="mid:DEA2B2DD-85C9-4BE2-A37C-775EC94FCD7C@apple.com"
type="cite" class="">
<div class="">
<div class="">
<blockquote type="cite" class="">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix">
<div class=""> Lay the foundations of
detection of slightly modified code
(semantic analysis required). Implement
tests for all the implemented functionality.
Prepare a final poster of the work and be
ready to present it.</div>
<div class=""><strong class="">Required
knowledge</strong>: Advanced C++, Basic
knowledge of Clang/Clang Static Analyzer.</div>
<p class=""><strong class="">Mentor</strong>:
Vassil Vassilev / maybe somebody else as
second mentor?</p>
<br class="">
On 07/02/14 22:20, Nick Lewycky wrote:<br
class="">
</div>
<blockquote
cite="mid:CADbEz-hdxzO6VFrRPewungnLxAPKZ7po1C07r5STaeV8z_+qpg@mail.gmail.com"
type="cite" class="">
<div dir="ltr" class="">
<div class="gmail_extra">
<div class="gmail_quote">On 7 February
2014 04:49, Vassil Vassilev <span
dir="ltr" class=""><<a
moz-do-not-send="true"
href="mailto:vvasilev@cern.ch"
target="_blank" class="">vvasilev@cern.ch</a>></span>
wrote:<br class="">
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div bgcolor="#FFFFFF"
text="#000000" class="">
<div class="im">
<div class="">On 05/02/14 21:32,
Nick Lewycky wrote:<br
class="">
</div>
<blockquote type="cite" class="">
<div dir="ltr" class="">
<div class="gmail_extra">
<div class="gmail_quote">On
3 February 2014 14:08,
Richard <span dir="ltr"
class=""><<a
moz-do-not-send="true"
href="mailto:legalize@xmission.com" target="_blank" class="">legalize@xmission.com</a>></span>
wrote:<br class="">
<blockquote
class="gmail_quote"
style="margin:0px 0px
0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br
class="">
In article <<a
moz-do-not-send="true"
href="mailto:CAENS6EsgzhXWfANFze8VAp68qDGHnrHNZJaaLmi28YJtnQwOmw@mail.gmail.com"
target="_blank"
class="">CAENS6EsgzhXWfANFze8VAp68qDGHnrHNZJaaLmi28YJtnQwOmw@mail.gmail.com</a>>,<br
class="">
<div class="">
David Blaikie <<a
moz-do-not-send="true" href="mailto:dblaikie@gmail.com" target="_blank"
class="">dblaikie@gmail.com</a>>
writes:<br class="">
<br class="">
> On Mon, Feb 3,
2014 at 3:06 AM,
Vassil Vassilev <<a
moz-do-not-send="true" href="mailto:vvasilev@cern.ch" target="_blank"
class="">vvasilev@cern.ch</a>>
wrote:<br class="">
><br class="">
</div>
<div class="">>
> A few months
ago I was looking
for a copy-paste
detector for a C++<br
class="">
> > project. I
didn't find such a
feature of clang's
static analyzer. Is
this<br class="">
> > the case?<br
class="">
><br class="">
> copy-paste
detector? As in
plagiarism detection?<br
class="">
<br class="">
</div>
I don't think
plagiarism is the
concern. The concern
is<br class="">
copy/paste of blocks
of code where the
pasted block needs to
be<br class="">
updated in several
places, but not all of
the updates were
performed.<br class="">
</blockquote>
<div class=""><br
class="">
</div>
<div class="">I've
implemented this sort
of thing, but it's
only 80% finished and
has been kicking
around on the
low-priority end of my
todo list for the past
couple of years. Patch
attached. It'd be
great if someone were
interested in
finishing this off. I
won't get to it soon.</div>
<div class=""><br
class="">
</div>
<div class="">Note that
it's a warning instead
of a static analysis
check which means that
it must have an
aggressively low
number of false
positives, and that it
must run quickly.
The implementation I
have analyzes
conditional operators
and if/else-if chains,
but doesn't collect
all the expressions
through something like
a && b
&& c &&
a. That would be the
next thing to add.</div>
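<div class="">A minimal sketch of the operand-collection step described above, not the code from the attached patch. It assumes a toy expression type instead of Clang's AST; Expr, makeName, makeAnd, collectAndOperands and findDuplicateOperands are hypothetical names introduced only for illustration.</div>

```cpp
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy expression node; this is NOT a Clang AST class.
struct Expr {
  enum Kind { Name, And } kind;
  std::string name;                // used when kind == Name
  std::unique_ptr<Expr> lhs, rhs;  // used when kind == And
};

// Build a leaf operand such as `a`.
std::unique_ptr<Expr> makeName(const std::string &n) {
  auto e = std::make_unique<Expr>();
  e->kind = Expr::Name;
  e->name = n;
  return e;
}

// Build `l && r`.
std::unique_ptr<Expr> makeAnd(std::unique_ptr<Expr> l,
                              std::unique_ptr<Expr> r) {
  auto e = std::make_unique<Expr>();
  e->kind = Expr::And;
  e->lhs = std::move(l);
  e->rhs = std::move(r);
  return e;
}

// Flatten a nested chain of && into its leaf operands, left to right.
void collectAndOperands(const Expr &e, std::vector<std::string> &out) {
  if (e.kind == Expr::And) {
    collectAndOperands(*e.lhs, out);
    collectAndOperands(*e.rhs, out);
  } else {
    out.push_back(e.name);
  }
}

// Return every operand that occurs more than once, each reported once.
std::vector<std::string> findDuplicateOperands(const Expr &e) {
  std::vector<std::string> operands;
  collectAndOperands(e, operands);
  std::set<std::string> seen, reported;
  std::vector<std::string> dups;
  for (const auto &op : operands) {
    // insert().second is false when the operand was already seen.
    if (!seen.insert(op).second && reported.insert(op).second)
      dups.push_back(op);
  }
  return dups;
}
```

<div class="">A real warning would compare profiled AST nodes rather than identifier strings; the strings here only keep the sketch short.</div>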
<div class=""><br
class="">
</div>
<div class="">It does
have some really cool
properties that we can
only get because clang
integrates closely
with its preprocessor.
Consider this sample
from the testcase:</div>
<div class=""><br
class="">
#define num_cpus() (1)<br
class="">
#define
max_omp_threads() (1)<br
class="">
int test8(int expr) {<br
class="">
if (expr) {<br
class="">
return num_cpus();<br
class="">
} else {<br class="">
return
max_omp_threads();<br
class="">
}<br class="">
}</div>
<div class=""><br
class="">
</div>
<div class="">We know
better than to warn on
that, even though the
AST looks the same. If
you instead write
"return num_cpus();"
twice, we warn on that
(that's test9 in the
testsuite).</div>
<div class=""><br
class="">
</div>
<div class="">Nick</div>
</div>
</div>
</div>
</blockquote>
</div>
Thanks, this looks very
interesting. This may be a good
start for a student. IIUC, non-unique
exprs are the ones that
have the same source ranges and the same
FileIDs, right? Could this be
upgraded to an AST-node (structural)
comparison?</div>
</blockquote>
<div class=""><br class="">
</div>
<div class="">It is an AST-node
comparison. In order to handle the
case of different macros, we ask the
AST nodes what their SourceLocation
was, and factor in the macro ID, if
there was one. A large part of the
patch is a change to the
Stmt::Profile logic to look at all
the SourceLocations in all the
possible AST nodes.</div>
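<div class="">The effect of factoring the macro into the comparison can be sketched as follows. This is an illustration under stated assumptions, not the patch's change to Stmt::Profile: Node, profileNode, sameProfile and returnOfMacro are hypothetical, and a plain string stands in for whatever macro identifier Clang records.</div>

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an AST node; real Clang nodes are far richer.
struct Node {
  std::string kind;       // e.g. "ReturnStmt", "IntegerLiteral"
  std::string value;      // literal text or identifier, may be empty
  std::string macroName;  // macro the node was expanded from, "" if none
  std::vector<Node> children;
};

// Append a flat description of the node and its children to `profile`.
// When `useMacro` is set, the macro origin is folded in as well.
void profileNode(const Node &n, bool useMacro,
                 std::vector<std::string> &profile) {
  profile.push_back(n.kind);
  profile.push_back(n.value);
  if (useMacro)
    profile.push_back(n.macroName);
  profile.push_back(std::to_string(n.children.size()));
  for (const Node &c : n.children)
    profileNode(c, useMacro, profile);
}

// Two nodes count as duplicates when their profiles are identical.
bool sameProfile(const Node &a, const Node &b, bool useMacro) {
  std::vector<std::string> pa, pb;
  profileNode(a, useMacro, pa);
  profileNode(b, useMacro, pb);
  return pa == pb;
}

// Model `return <macro>();` where the macro expands to the literal 1.
Node returnOfMacro(const std::string &macro) {
  Node lit{"IntegerLiteral", "1", macro, {}};
  return Node{"ReturnStmt", "", "", {lit}};
}
```

<div class="">With useMacro off, returnOfMacro("num_cpus") and returnOfMacro("max_omp_threads") compare equal because both expand to the same literal; with it on they differ, while two returns of num_cpus() still match.</div>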
<div class=""> </div>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div bgcolor="#FFFFFF"
text="#000000" class=""><span
class="HOEnZb"><font class=""
color="#888888"><br class="">
Vassil</font></span>
<div class="im"><br class="">
<blockquote type="cite" class="">
<div dir="ltr" class="">
<div class="gmail_extra">
<div class="gmail_quote">
<div class=""><br
class="">
</div>
<blockquote
class="gmail_quote"
style="margin:0px 0px
0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Coverity
can detect such
instances, for
instance.<br class="">
<br class="">
Here is an article
from 2006 describing
such a tool:<br
class="">
<<a
moz-do-not-send="true"
href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.123.113"
target="_blank"
class="">http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.123.113</a>><br
class="">
<br class="">
Wikipedia says PMD has
a copy/paste detector
that works with C++:<br
class="">
<<a
moz-do-not-send="true"
href="http://en.wikipedia.org/wiki/PMD_%28software%29#Copy.2FPaste_Detector_.28CPD.29"
target="_blank"
class="">http://en.wikipedia.org/wiki/PMD_(software)#Copy.2FPaste_Detector_.28CPD.29</a>><br
class="">
<br class="">
"Note that CPD works
with Java, JSP, C,
C++, C#, Fortran and
PHP code.<br class="">
Your own language is
missing ? See how to
add it here"<br
class="">
<<a
moz-do-not-send="true"
href="http://pmd.sourceforge.net/snapshot/cpd-usage.html"
target="_blank"
class="">http://pmd.sourceforge.net/snapshot/cpd-usage.html</a>><br
class="">
<span class=""><font
class=""
color="#888888">--<br
class="">
"The Direct3D
Graphics Pipeline"
free book <<a
moz-do-not-send="true"
href="http://tinyurl.com/d3d-pipeline" target="_blank" class="">http://tinyurl.com/d3d-pipeline</a>><br
class="">
The Computer
Graphics Museum
<<a
moz-do-not-send="true"
href="http://computergraphicsmuseum.org/" target="_blank" class="">http://ComputerGraphicsMuseum.org</a>><br
class="">
The
Terminals Wiki
<<a
moz-do-not-send="true"
href="http://terminals.classiccmp.org/" target="_blank" class="">http://terminals.classiccmp.org</a>><br
class="">
Legalize
Adulthood! (my
blog) <<a
moz-do-not-send="true"
href="http://legalizeadulthood.wordpress.com/" target="_blank" class="">http://LegalizeAdulthood.wordpress.com</a>><br
class="">
</font></span>
<div class="">
<div class="">_______________________________________________<br
class="">
cfe-dev mailing
list<br class="">
<a
moz-do-not-send="true"
href="mailto:cfe-dev@cs.uiuc.edu" target="_blank" class="">cfe-dev@cs.uiuc.edu</a><br
class="">
<a
moz-do-not-send="true"
href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" target="_blank"
class="">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><br
class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
<br class="">
</blockquote>
<br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
</blockquote>
<br class="">
<br class="">
<pre class="moz-signature" cols="72">--
--------------------------------------------
Q: Why is this email five sentences or less?
A: <a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://five.sentenc.es/">http://five.sentenc.es</a>
</pre>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
<br class="">
</blockquote>
<br>
<br>
</body>
</html>