<div dir="ltr">Initial gut reaction would be this is perhaps a big enough patch/divergence from upstream gtest that it should go into upstream gtest first & maybe sync'ing up with a more recent gtest into LLVM? Though I realize that's a bit of a big undertaking (either/both of those steps). How does this compare to other local patches to gtest we have?</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Feb 26, 2021 at 10:47 AM via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">This note describes the first part of the Rotten Green Tests project.<br>

<br>

"Rotten Green Tests" is the title of a paper presented at the 2019<br>

International Conference on Software Engineering (ICSE).  Stripped to<br>

its essentials, the paper describes a method to identify defects or<br>

oversights in executable tests.  The method has two steps:<br>

<br>

(a) Statically identify all "test assertions" in the test program.<br>

(b) Dynamically determine whether these assertions are actually<br>

executed.<br>

<br>

A test assertion that has been coded but is not executed is termed a<br>

"rotten green" test, because it allows the test to be green (i.e.,<br>

pass) without actually enforcing the assertion.  In many cases it is<br>

not immediately obvious, just by reading the code, that the test has a<br>

problem; the Rotten Green Test method helps identify these.<br>

<br>

The paper describes using this method on projects coded in Pharo<br>

(which appears to be a Smalltalk descendant) and so the specific tools<br>

are obviously not applicable to a C++ project such as LLVM.  However,<br>

the concept can be easily transferred.<br>

<br>

I applied these ideas to the Clang and LLVM unittests, because these<br>

are all executable tests that use the googletest infrastructure.  In<br>

particular, all "test assertions" are easily identified because they<br>

make use of macros defined by googletest; by modifying these macros,<br>

it should be feasible to keep track of all assertions, and report<br>

whether they have been executed.<br>

<br>

The mildly gory details can be saved for the code review and of course<br>

an LLVM Dev Meeting talk, but the basic idea is: Each test-assertion<br>

macro will statically allocate a struct identifying the source<br>

location of the macro, and have an executable statement recording<br>

whether that assertion has been executed.  Then, when the test exits,<br>

we look for any of these records that haven't been executed, and<br>

report them.<br>

<br>

I've gotten this to work in three environments so far:<br>

1) Linux, with gcc as the build compiler<br>

2) Linux, with clang as the build compiler<br>

3) Windows, with MSVC as the build compiler<br>

<br>

The results are not identical across the three environments.  Besides<br>

the obvious case that some tests simply don't operate on both Linux<br>

and Windows, there are some subtleties that cause the infrastructure<br>

to work less well with gcc than with clang.<br>

<br>

The infrastructure depends on certain practices in coding the tests.<br>

<br>

First and foremost, it depends on tests being coded to use the<br>

googletest macros (EXPECT_* and ASSERT*) to express individual test<br>

assertions.  This is generally true in the unittests, although not as<br>

universal as might be hoped; ClangRenameTests, for example, buries a<br>

handful of test assertions inside helper methods, which is a plausible<br>

coding tactic but makes the RGT infrastructure less useful (because<br>

many separate tests funnel through the same EXPECT/ASSERT macros, and<br>

so RGT can't discern whether any of those higher-level tests are<br>

rotten).<br>

<br>

Secondly, "source location" is constrained to filename and line number<br>

(__FILE__ and __LINE__), therefore we can have at most one assertion<br>

per source line.  This is generally not a problem, although I did need<br>

to recode one test that used macros to generate assertions (replacing<br>

it with a template).  In certain cases it also means gcc doesn't let<br>

us distinguish multiple assertions, mainly in nested macros, for an<br>

obscure reason.  But those situations are not very common.<br>

<br>

There are a noticeable number of false positives, with two primary<br>

sources. One is that googletest has a way to mark a test as DISABLED;<br>

this test is still compiled, although never run, and all its<br>

assertions will therefore show up as rotten.  The other is due to the<br>

common LLVM practice of making environmental decisions at runtime<br>

rather than compile time; for example, using something like 'if<br>

(isWindows())' rather than an #ifdef.  I've recoded some of the easier<br>

cases to use #ifdef, in order to reduce the noise.<br>

<br>

Some of the noise appears to be irreduceable, which means if we don't<br>

want to have bots constantly red, we have to have the RGT reporting<br>

off by default.<br>

<br>

Well... actually... it is ON by default; however, I turn it off in<br>

lit.  So, if you run `check-llvm` or use `llvm-lit` to run unittests,<br>

they won't report rotten green tests.  However, if you run a program<br>

directly, it will report them (and cause the test program to exit with<br>

a failure status).  This seemed like a reasonable balance that would<br>

make RGT useful while developing a test, without interfering with<br>

automation.<br>

<br>

<br>

The overall results are quite satisfying; there are many true<br>

positives, generally representing coding errors within the tests.<br>

A half-dozen of the unittests have been fixed, with more to come, and<br>

the RGT patch itself is at: <a href="https://reviews.llvm.org/D97566" rel="noreferrer" target="_blank">https://reviews.llvm.org/D97566</a><br>

<br>

Thanks,<br>

--paulr<br>

_______________________________________________<br>

cfe-dev mailing list<br>

<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>

<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>

</blockquote></div>