[Mlir-commits] [mlir] [mlir][arith] Add more canonicalization and integration tests coverage (PR #92272)

Thu Aug 29 12:56:14 PDT 2024

banach-space wrote:

>  I think this is where we differ: the main way of testing LLVM (and MLIR) is with unit-tests checking the lowering through IR matching, not integration tests. These are more exceptional.

I agree with Mehdi.

We (Arm folks working on SVE and SME support) tend to use e2e tests for rather complex examples (e.g. `linalg.matmul`) - but even those are mostly used as smoke tests. e2e tests are super helpful when evaluating complex compilation pipelines, but that's "integration" testing and should not be treated as a replacement for "unit" testing. That's how I look at the tests being added here.

> On motivation: some of the existing tests do indeed test existing/historical bugs in MLIR

>From my experience, such tests/bugs are symptoms of insufficient _unit_ test coverage. But rather than using reproducers as is (these tend to be relatively complex), we should try to identify the corresponding edge case that has been missed and test that explicitly (it's much easier to reason about a test once the underlying edge case is clear). 

> As for runtime costs of these tests, I did some tests earlier today (personal laptop, default options, all integration tests), and altogether, the new tests add ~1 second to the total running time (263.67s with all new tests, 262.87s without, average taken over 3 runs).

This is re-assuring, thanks for checking. Still, we should make sure that we don't encourage folks to add too many new cases. My initial steer was to test all data types, but now I realise that we should avoid that.

> For instance, it's challenging (at least for me) to look at LLVM bytecode and determine with certainty that it will perform as intended for all inputs.

Please note that we will never run tests for all possible inputs ;-)

Just to summarise, I do find these new e2e tests helpful, but lets limit the scope and focus on a small set of interesting cases. @pingshiyu , what other Ops do you intend to test? Once you finish, we could consolidate all tests into one file. We can use [correctnes.mlir](https://github.com/llvm/llvm-project/blob/main/mlir/test/Integration/Dialect/Complex/CPU/correctness.mlir) as an indication (roughly) of what the scope should be.

https://github.com/llvm/llvm-project/pull/92272