[cfe-dev] Using clang static analyser / clang-tidy for assignments grading?

Sat Feb 29 11:11:54 PST 2020

Hi Nathan! How're you?
Thanks for your kind reply!

I started to dive into Clang, and while it does have a learning curve, I
think that it'll be perfect.

About manual grading: we do have a manual grading process, but we want to
automate simple tasks such as checking whether a forbidden function was
used, as it's really easy to miss one when grading an exercise. Of course,
when the system reports such things, we will check them manually as well,
but the first check should, ideally, be automatic.

My main question now is whether I should write a clang-tidy library or
create an entirely new tool, from scratch, using Clang and LibTooling. I'd
love to hear your thoughts. On the one hand, clang-tidy is very mature, has
module support and even contains some of the checks that we need. On the
other hand, I have a few concerns:

   - We want to perform actions with the diagnostic results. For example,
   if a student used a public class variable, we want to deduct points for
   that. Our faculty has an internal file format that can be used to
   automatically deduct points from (or add points to) a student's grade.
   So, we need some kind of hook that lets us receive a report from
   "NonPrivateMemberVariablesInClassesCheck" so that we can write to that
   file at that point.
   As far as I understand, clang-tidy only allows us to print the
   diagnostics to the screen. Is there any way I can achieve that behaviour
   using clang-tidy?
   - clang-tidy allows disabling linting using a "// NOLINT" comment. We
   want to disable this feature so that students won't be able to cancel
   our checks. If I understand correctly, it can be done using
   CommentHandler, but I want to make sure, as otherwise we won't be able
   to use clang-tidy.
   - We want to perform an API comparison between the expected API
   definition (provided as JSON/YAML etc.) and the API of the actual
   implementation. Based on that, we want to write a new .h file that
   contains a macro for every implemented function and feature. For example,
   if we asked the students to implement HashMap<K, V>::at(const K&), we
   will write to the output .h file a macro named HASHMAP_METHOD_AT if the
   method exists, and in addition HASHMAP_METHOD_AT_RETURN_LVALUE if the
   function returns an lvalue. We need that because we write C++ tests to
   check the students' code, and we noticed that they sometimes confuse
   lvalues and rvalues, so our tests don't compile and they fail
   everything. In addition, sometimes they even forget to implement some
   methods (for example, in the HashMap exercise we gave them, some forgot
   to implement the entire set of methods an iterator requires), and that
   killed the compilation process as well.

   That's just a concept, as I'm not sure how well it can be implemented,
   but our end goal is to be able to write C++ tests that won't get
   compilation errors if the students didn't implement the entire API
   correctly.
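
A minimal sketch of the feature-macro idea from the last item, under stated
assumptions: the two macros would normally live in the generated .h file but
are defined inline here so the example is self-contained, and the HashMap
class is a hypothetical stand-in for a student's submission (backed by
std::map purely for brevity). The name run_tests is also illustrative.

```cpp
// Normally these would come from the generated header; they are defined
// inline here so the sketch compiles on its own.
#define HASHMAP_METHOD_AT
#define HASHMAP_METHOD_AT_RETURN_LVALUE

#include <cassert>
#include <map>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a student's HashMap submission.
template <typename K, typename V>
class HashMap {
public:
    void insert(const K& key, const V& value) { data_[key] = value; }
    V& at(const K& key) { return data_.at(key); }

private:
    std::map<K, V> data_;
};

// Runs only the tests whose feature macros are defined, so a missing or
// mis-declared method skips a test instead of breaking compilation.
// Returns the number of tests that were executed.
int run_tests() {
    int executed = 0;
#ifdef HASHMAP_METHOD_AT
    HashMap<std::string, int> map;
    map.insert("a", 1);
    assert(map.at("a") == 1);
    ++executed;
#ifdef HASHMAP_METHOD_AT_RETURN_LVALUE
    // Only compiled when the tool reported that at() returns an lvalue.
    static_assert(std::is_lvalue_reference<decltype(map.at("a"))>::value,
                  "at() is expected to return an lvalue reference");
    map.at("a") = 5;
    assert(map.at("a") == 5);
    ++executed;
#endif
#endif
    return executed;
}
```

Calling run_tests() from main() runs both tests here. Removing
HASHMAP_METHOD_AT_RETURN_LVALUE skips the assignment test rather than failing
the build, which is the behaviour the generated header is meant to provide.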

So my question is: do you think that clang-tidy is suitable for such
things, or should I write a new Clang and LibTooling based tool that can do
that?
I really appreciate your time and efforts to help us!
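
For the first concern above, one tool-agnostic fallback would be to
post-process the diagnostics that clang-tidy prints. The sketch below assumes
the usual "path:line:col: warning: message [check-name]" line layout; the
Finding struct, the function names, the penalty table and the sample message
are all hypothetical, and writing to the faculty's internal file format is
left out.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// One diagnostic extracted from the printed output (hypothetical type).
struct Finding {
    std::string file;
    int line;
    std::string check;
};

// Collects warning/error lines of the assumed form
// "path:line:col: warning: message [check-name]".
// Notes, source snippets and caret lines do not match and are skipped.
std::vector<Finding> parse_findings(const std::string& output) {
    static const std::regex pattern(
        R"re((.+):(\d+):\d+: (?:warning|error): .* \[([^\]]+)\])re");
    std::vector<Finding> findings;
    std::istringstream in(output);
    std::string line;
    while (std::getline(in, line)) {
        std::smatch match;
        if (std::regex_match(line, match, pattern)) {
            findings.push_back(
                {match[1].str(), std::stoi(match[2].str()), match[3].str()});
        }
    }
    return findings;
}

// Sums the points to deduct; checks missing from the table cost nothing.
int total_penalty(const std::vector<Finding>& findings,
                  const std::map<std::string, int>& penalties) {
    int total = 0;
    for (const Finding& finding : findings) {
        auto it = penalties.find(finding.check);
        if (it != penalties.end()) {
            total += it->second;
        }
    }
    return total;
}

// Usage example with an illustrative diagnostic for a public data member.
void self_check() {
    const std::string output =
        "student.cpp:12:9: warning: member variable 'x' has public "
        "visibility [misc-non-private-member-variables-in-classes]\n"
        "    int x;\n"
        "        ^\n";
    const std::vector<Finding> findings = parse_findings(output);
    assert(findings.size() == 1);
    assert(total_penalty(
               findings,
               {{"misc-non-private-member-variables-in-classes", 5}}) == 5);
}
```

Parsing printed text is brittle compared with a hook inside the tool, so this
is only a stopgap if no such hook turns out to exist.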

Best regards,
Yahav.

On Sat, Feb 29, 2020 at 4:57 AM Nathan James <n.james93 at hotmail.co.uk>
wrote:

> Hi Yahav,
>
> It is certainly possible using the static analyser and clang-tidy
> frameworks to implement most of that checking. In clang-tidy there are
> dedicated checks for some of the restrictions you need. For the others,
> it would be very easy to write AST matchers or preprocessor callbacks.
>
> However it's my personal opinion that this should not be used as a
> replacement for human grading but more as an assistant. When patches
> are submitted for review here we have automatic checks that check validity
> and can reject obvious flaws, but it is always up to a reviewer to have a
> look over before a go-ahead is given.
>
> If you do decide to go down the route of using clang to help out then
> definitely check out the code in the clang-tidy checks (static analyser not
> so much), have a look at the AST matchers docs
> https://clang.llvm.org/docs/LibASTMatchersReference.html and have a play
> with clang-query.
>
> Kind regards,
> Nathan James.
>
> ------------------------------
> *From:* cfe-dev <cfe-dev-bounces at lists.llvm.org> on behalf of Yahav Bar
> via cfe-dev <cfe-dev at lists.llvm.org>
> *Sent:* Friday, 28 February 2020, 16:29
> *To:* cfe-dev at lists.llvm.org
> *Subject:* [cfe-dev] Using clang static analyser / clang-tidy for
> assignments grading?
>
> Hi everyone!
> I hope that this is the correct mailing list for my question.
>
> I work as a teaching assistant at the Hebrew University, teaching C and
> C++. As part of our course, we ask the students to submit C and C++
> exercises which we grade (both manually and automatically).
>
> When grading students' exercises, we check the validity of their code. For
> example,
>
>    - Whether the students remembered to use an include guard;
>    - Whether the students used unsafe functions, which we consider
>    forbidden (in addition, sometimes we explicitly tell students not to
>    use a set of predefined functions or C++ classes, as the exercise asks
>    them to implement that set of functions themselves);
>    - Whether the students included a forbidden header;
>    - Whether the students used a #pragma statement to bypass our
>    compilation instructions;
>    - C++: whether the students remembered to use "const" when required,
>    and to pass parameters by reference when needed;
>    - C++: whether the students returned an lvalue when needed and an
>    rvalue when needed, etc.;
>    - C++: when writing an iterator, whether it was implemented correctly
>    (i.e., according to the Input/Output/Forward/Bidirectional/Random Access
>    iterator rules, and using iterator traits).
>
> Right now these checks are being done by a human. As our classes have
> between 300 and 600 students (next semester we'll have 650...), it'll be
> really hard and inefficient to do this by hand. Thus I thought it might be
> a good idea to automate these checks too.
>
> Initially, I thought to write that in Python using ANTLR (creating an
> AST for both the preprocessing stage and the C stage and just iterating
> over them), but midway through the programming, I came across the Static
> Analyser API of Clang and thought I should switch to it, as it seems very
> mature and well suited to our needs.
>
> Before diving deeply into clang, I'd love to hear from you, who have
> experience in clang and LLVM dev, whether I'm on the right track and can
> actually achieve my goal using clang, or whether I should stick with my
> previous attempt. Our end goal is a program that we can run, passing it
> the path to a student's exercise, and get the errors the student made so
> that we can deduct points accordingly.
>
> Thank you very much!
> Yahav.
>
>