[cfe-users] C++ Scoring Tool

Daniel via cfe-users cfe-users at lists.llvm.org
Tue Apr 17 00:40:52 PDT 2018


For the senior project in my undergraduate studies, my team and I are 
developing a tool that will evaluate the format and code conventions of 
a C++ program, outputting a score and displaying useful messages, very 
much like pylint for Python.

The idea is similar to clang-format, except that the tool makes no 
alterations to the code. It would be used as a teaching aid and automatic 
grader. To handle the beautiful diversity of C++, it shouldn't constrain 
the author to any particular style (although it should be able to do 
that too). For example, consider opening curly braces on the same line 
as the function declaration versus on a new line. In this case, the 
tool could check for consistency only: as long as the entire file uses 
the same format, it gets a perfect score. If, however, 10 braces are on 
the same line and 9 are on a new line, the score is penalized more 
heavily than if 18 are on the same line and 1 is on a new line. The idea 
is to enforce consistency without getting in the way of the author's 
preferred style. This should give professors a robust tool to teach C++.
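
As a rough sketch of the scoring idea, one option is to score a check by 
the share of occurrences that follow the most common style. The function 
name and the formula are illustrative assumptions only, not a settled 
design:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts[i] is how often style i was seen in a file,
// e.g. {braces_on_same_line, braces_on_new_line}.
// Returns the fraction of occurrences that use the dominant style,
// so 1.0 means the file is fully consistent.
double consistencyScore(const std::vector<std::size_t>& counts) {
    std::size_t total = 0;
    std::size_t dominant = 0;
    for (std::size_t c : counts) {
        total += c;
        dominant = std::max(dominant, c);
    }
    // Nothing to judge: treat the file as perfectly consistent.
    if (total == 0) {
        return 1.0;
    }
    return static_cast<double>(dominant) / static_cast<double>(total);
}
```

With this formula, a 10/9 split scores 10/19 (about 0.53) and an 18/1 
split scores 18/19 (about 0.95), so the more evenly mixed file is 
penalized more, and no particular style is favored.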

I was hoping the Clang community could help me understand the inner 
workings of Clang a little better. Right now, my hangup is getting 
format data to work in conjunction with Clang's AST. What I'm trying 
to do is get back the whitespace, comment, and bracket information that 
is lost during AST buildup. Suppose I want to check that all operators 
have consistent spacing, something like "(2 * 2)" versus "(2*2)" 
versus "(2* 2)". The AST will be used to get the semantics of that 
particular operator so as not to confuse it with the array pointer 
operator, but I need to count the whitespace before and after the 
operator. The same concept will be applied to the whitespace 
surrounding statements. If done right, I should be able to treat all 
operators the same way no matter how complex the expression is. 
Something like "(x - 4) / 3 * (2 +1)" would show an inconsistency in the 
end part "(2 +1)" because of a missing space.
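
To make the whitespace counting concrete, here is a minimal sketch in 
plain C++ that does not use Clang at all. It assumes the operator's 
byte offset and length in the source text are already known; the 
OperatorSpacing struct and measureSpacing function are hypothetical 
names for illustration:

```cpp
#include <cstddef>
#include <string>

// Number of blank characters directly before and after an operator token.
struct OperatorSpacing {
    std::size_t before;
    std::size_t after;
};

// Counts spaces and tabs adjacent to the token that starts at `offset`
// and is `length` characters long. Newlines are deliberately not counted.
OperatorSpacing measureSpacing(const std::string& source,
                               std::size_t offset,
                               std::size_t length) {
    auto isBlank = [](char c) { return c == ' ' || c == '\t'; };
    OperatorSpacing result{0, 0};
    if (offset > source.size()) {
        return result;  // Offset outside the buffer: nothing to measure.
    }
    // Walk left from the operator while the previous character is blank.
    for (std::size_t i = offset; i > 0 && isBlank(source[i - 1]); --i) {
        ++result.before;
    }
    // Walk right from the end of the operator while characters are blank.
    for (std::size_t i = offset + length;
         i < source.size() && isBlank(source[i]); ++i) {
        ++result.after;
    }
    return result;
}
```

For "(x - 4) / 3 * (2 +1)", the '-' operator yields one blank on each 
side, while the '+' yields one blank before and none after, which is the 
inconsistency the tool should report.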

My first thought was to use the SourceManager location information to 
point back to the source code, then process and identify the whitespace 
from there. However, this seems wildly inefficient and inelegant. My 
second thought was to somehow get clang to keep the whitespace 
information and add it to the AST, but I believe there are inherent 
difficulties with that.
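
For the first approach, the step of pointing back into the source could 
look roughly like the sketch below. It assumes the location of an 
operator is available as a 1-based line and column, and that the file 
contents are held in a string; offsetOf is a hypothetical helper and no 
Clang interfaces are used here:

```cpp
#include <cstddef>
#include <string>

// Maps a 1-based (line, column) pair to a 0-based offset into `source`.
// Returns std::string::npos if the position does not exist in the text.
std::size_t offsetOf(const std::string& source,
                     unsigned line,
                     unsigned column) {
    if (line == 0 || column == 0) {
        return std::string::npos;
    }
    // Skip forward over (line - 1) newline characters.
    std::size_t lineStart = 0;
    for (unsigned current = 1; current < line; ++current) {
        std::size_t newline = source.find('\n', lineStart);
        if (newline == std::string::npos) {
            return std::string::npos;  // The file has fewer lines.
        }
        lineStart = newline + 1;
    }
    // The requested column must fall before the end of that line.
    std::size_t lineEnd = source.find('\n', lineStart);
    if (lineEnd == std::string::npos) {
        lineEnd = source.size();
    }
    std::size_t offset = lineStart + (column - 1);
    return offset < lineEnd ? offset : std::string::npos;
}
```

This version rescans the text on every call. A tool that looks up many 
locations in the same file would probably record the line start offsets 
once and reuse them.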

My biggest problem is my lack of expertise with Clang's source code. Does 
anybody have any ideas on how I can get Clang to give me the information 
I need to support the above functionality?

Thanks for any interest. I hope this is an appropriate mailing list to 
post my question.

