[llvm-dev] RFC: Reconsidering adding gmock to LLVM's unittest utilities

Wed Jan 4 22:16:23 PST 2017

No strong opinion, but certainly not opposed.  If it makes testing pass 
manager changes easier, SGTM.

Philip

On 01/04/2017 06:11 AM, Chandler Carruth via llvm-dev wrote:
> A long time ago I suggested that we might want to add gmock to 
> complement the facilities provided by gtest in LLVM's unittests. It
> didn't go over well:
>
> 1) There was concern over the benefit vs. the cost
> 2) Also concern about what the facilities would look like in practice 
> and whether they would actually help
> 3) At the time, I didn't have good, large examples of what these 
> things might look like or why they might be attractive
> 4) I didn't provide any real explanation of what gmock *did* and so it 
> was vague and unclear.
>
> Since then, a lot has changed. We make heavier use of unit testing in
> the project, and more developers are finding benefit from it. And I
> think I have compelling examples.
>
> ## Matchers
>
> To start off, it is important to understand that there are two 
> components to what gmock offers. The first has very little to do with 
> "mocks". It is actually a matcher language and system for writing test 
> predicates:
>
>   EXPECT_EQ(expected, actual);
>   EXPECT_NE(something, something);
>
> Become instead:
>
>   EXPECT_THAT(actual, Eq(expected));
>   EXPECT_THAT(actual, Ne(not_expected));
>
> This pattern moves the *matcher* out of the *macro*, giving it a 
> proper C++ API. With that, we get two huge benefits: extensibility and 
> composability. You can easily write a matcher that summarizes 
> concisely the expectation for custom data types. And you can compose 
> these matchers in powerful ways. I'll give one example here:
>
>   EXPECT_THAT(MyDenseMap, UnorderedElementsAre(Pair(key1, value1),
>       Pair(key2, value2), Pair(key3, value3)));
>
> Here I'm composing equality matchers inside a matcher that can handle 
> *unordered* container element-wise comparison for generic, arbitrary 
> containers. With a small patch, I've even extended it to support 
> arbitrary iterator ranges! Combine this with custom matchers for the 
> elements, and it becomes a very expressive and declarative way to write
> expectations in tests.
>
> I wanted to give a realistic and compelling example so I rewrote an 
> entire test: https://reviews.llvm.org/D28290 Note that I moved *every* 
> EXPECT to the new syntax so this is essentially worst-case. It also 
> involves a non-trivial custom matcher. Despite this, the code is 
> shorter, easier to read and easier to maintain. It has fewer 
> unnecessary orderings enforced. And it is much easier to extend. Also, 
> the error messages when it fails are substantially improved because 
> these composed matchers have logic to carefully explain *why* they 
> failed to match.
>
> I hope folks find this compelling. I think this alone is worth 
> carrying the gmock code in tree -- it is just used by tests and not 
> substantially larger than gtest. Even if we decide we want nothing to 
> do with mocks, I would very much like to have the matchers.
>
>
> ## Mocks
>
> So, now let's consider mocks. First off, what are mocks? I'll give a 
> fairly casual definition here: they are test objects which implement 
> some API and allow the test to explicitly set expectations on how that 
> API is used and how it in turn should behave. For a more detailed 
> vocabulary see [1] and for a more lengthy discussion see [2].
>
> As came up in the original discussion, LLVM relatively infrequently 
> has a need to test API interactions in this way. Usually we're in the 
> business of translating things from format A to B (instructions, 
> metadata, whatever) and can write down one format and write checks 
> against the other format for tests. This is a wonderful world to live 
> in with tests. I never want LLVM to *decrease* how much we leverage this.
>
> But we *do* have API interactions that we need to test. We have plugin 
> APIs, and hookable interfaces, ranging from Clang frontend actions to 
> JIT listeners. We also have *generic* code in ADT that is all about 
> API interactions. Most generic code in fact is -- we want it to work 
> for *any* T that behaves in a certain way, so we need to give it 
> interesting Ts to test it.
>
> My immediate example is the pass manager. We plug in a bunch of passes 
> to it, and expect it to run them in a precise way over specific bits 
> of IR. When testing this, it is extremely cumbersome to write a test 
> pass which does this in interesting and yet controllable and 
> comprehensible ways. Let's look at a concrete example:
>
> https://github.com/llvm-project/llvm-project/blob/master/llvm/unittests/IR/PassManagerTest.cpp#L481-L509
>
> Here we have over 20 lines of code spent testing that the correct set 
> of things happened the correct number of times. I had to write a long 
> comment just to explain what these numbers mean. And I still never 
> understand whether a change in the numbers really means a good or bad 
> thing.
>
> Now, we *do* have detailed, logging-based tests that use FileCheck,
> which is the primary way to avoid this in LLVM. But it isn't enough. In
> these tests we want to carefully *permute* the behavior of very
> specific runs of individual passes. A simple example of this can be
> seen here, where we have somewhat magical state in a pass to flip-flop
> its behavior:
> https://github.com/llvm-project/llvm-project/blob/master/llvm/unittests/IR/PassManagerTest.cpp#L138-L139
>
> And it gets more complicated if you want statefulness like triggering 
> on the *3rd* run of the pass.
>
> But these are exactly the kinds of scenarios that I needed to write
> tests for in order to get the code to be correct. I have consistently 
> found and been able to fix bugs throughout the pass manager by writing 
> careful unittests.
>
> Mocks with GoogleMock are, IMO, a *tool to create interesting and 
> debuggable test objects*. These objects can then be fed into an API to 
> exercise it in ways that are hard or impossible to control from a 
> command line in sufficient granularity and precision. While doing this 
> is never fun and should be avoided where possible, when we need to do 
> this I think it provides a powerful tool for the job.
>
> Here is how it works at the highest level:
> 1) Create a class with a MOCK_METHOD*(...) API. This API is then 
> hookable by gmock.
> 2) Use some APIs to register default behaviors for the APIs.
> 3) Set up the *minimal* amount of expected API interactions for a given
> test, i.e., for this test to pass, X has to happen and in response to
> that my code needs to do Y.
> 4) Feed this class, or a wrapper around it if you need a copyable 
> object, into the system you are testing and run it.
>
> If the expected interactions don't occur, you get a trace of which 
> ones failed and why. These traces are somewhat verbose and hard to 
> read, but they actually have the information needed to debug the
> system, which saves you from building infrastructure to extract that
> over and over again.
>
> But a concrete example will likely work better. I've used gmock to 
> build the unit tests for a major revision of the LoopPassManager in 
> the new pass manager. This is a substantial redesign that now handles 
> inserting new loops, deleting loops, and invalidating analyses. The 
> tests for it are, IMO, dramatically more readable than the test linked 
> above. And they are substantially more thorough and precise:
>
> https://reviews.llvm.org/D28292
>
> I hope this is compelling for folks. Just writing and debugging this 
> one test was extremely compelling for me. I ended up with much better 
> coverage and precision than I would have using any other technique 
> without a tremendous amount of plumbing essentially re-inventing a 
> framework to build test pass objects that work exactly the way these 
> mock pass handles do.
>
> That said, all is not perfect. For instance, gmock suffers from being
> designed in a C++98 world. It has relatively poor support for move and
> value semantics, which resulted in my using a wrapper around the mock 
> interfaces in the above patch to let a pimpl idiom provide the value 
> semantics I wanted. However, that idiom works well, and this didn't 
> substantially impede my use of the infrastructure.
>
> Also, I remain very sympathetic to the idea that this kind of testing 
> apparatus should be relatively rarely needed. We shouldn't be writing 
> new complex unit tests for APIs every week. But even a few use cases,
> such as testing ADTs and generic tools like the pass manager, seem to
> justify the cost to me, and I'm happy to help draw up fairly 
> restrictive guidance around mocks for the coding standards.
>
>
> Thanks, and sorry for the long email, but I wanted to try and lay out 
> the issues in a way folks could understand, and the examples, while 
> hopefully useful, are quite large and complex.
>
> Please don't hesitate to ask questions if stuff isn't clear.
> -Chandler
>
> [1]: https://en.wikipedia.org/wiki/Test_double
> [2]: http://martinfowler.com/articles/mocksArentStubs.html
