[LLVMdev] [RFC] Project for GSoC: Unit/Regression testing for CodeGen

Fri Mar 6 08:09:29 PST 2015

On Fri, Mar 6, 2015 at 7:04 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> Hi everyone,
>
> In response to yet another fix in CodeGen affecting only an out-of-tree target (r231186), our inability to properly unit test CodeGen components has been highlighted. It was suggested that improving this situation might be a good GSoC project, and I agree, provided that we can settle on the scope and basic design ahead of time.
>
> I'd like to add that I feel this is a serious problem even for in-tree targets. We currently construct IR-level tests for CodeGen components, but this is very fragile. Many of the IR-level CodeGen tests, especially "bug-triggering" regression tests, don't currently test the logic they were originally designed to cover.

Yes, yes!  Doing something - anything - in the area would be great =)

> Now, for a design:
>
> One idea that I've had for some time is to develop a 'mock' target for testing. For this target, all of the various type/operation legality settings would be determined by some input configuration file. It would contain instructions, mostly in 1:1 correspondence to our SelectionDAG node types, and many different register classes of different sizes, different calling conventions, etc. (again, some input configuration file would determine which were active). We could then use this mock target to write regression tests for CodeGen components. We could also use it to write unit tests, especially at the MI level.

This is arguably a different and bigger issue ("Target" vs "CodeGen"),
but a mock target only helps with generic (lib/CodeGen) bugs, not
with target-specific ones (I'm thinking ISel, or stuff like
AnalyzeBranch), right?

For the latter, the idea of serializing MI (or, more importantly, the
SelectionDAG) has been floating around for a while.
I think that would help a lot, but once you start serializing an
internal representation, you no longer know whether it's still
possible to reach it from IR, so you have the same staleness problem
we have today.
That can be solved by adding a companion test, written the way we
write tests now (from IR), that checks the serialized representation
right before the to-be-tested component.  When someone makes a change
that affects this companion, they have an opportunity to re-evaluate
the unit test as well.  A bit verbose, yes, but does it sound sensible?
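
To make the unit test / companion test pairing concrete, here is a toy
sketch in Python.  Nothing in it is a real LLVM API: `lower`,
`remove_self_copies`, and the textual format are all made up for
illustration, and a real version would presumably be a pair of lit
tests rather than Python functions.

```python
# Toy model only: none of these names correspond to real LLVM APIs.

def lower(ir_lines):
    """Stand-in for every stage that runs before the component under test."""
    out = []
    for line in ir_lines:
        dst, op, *args = line.split()
        out.append(f"{op.upper()} {dst}, {', '.join(args)}")
    return "\n".join(out)


def remove_self_copies(serialized):
    """Stand-in for the component under test: drops 'COPY x, x'."""
    kept = []
    for inst in serialized.splitlines():
        opcode, operands = inst.split(" ", 1)
        ops = [o.strip() for o in operands.split(",")]
        if opcode == "COPY" and ops[0] == ops[1]:
            continue
        kept.append(inst)
    return "\n".join(kept)


# The serialized representation the unit test is written against.
UNIT_TEST_INPUT = "ADD r1, r2, r3\nCOPY r1, r1"


def unit_test():
    # Exercises the component directly, starting from the serialized form.
    return remove_self_copies(UNIT_TEST_INPUT) == "ADD r1, r2, r3"


def companion_test():
    # Starts from (toy) IR and checks that the earlier stages still
    # produce exactly the input the unit test assumes.
    ir = ["r1 add r2 r3", "r1 copy r1"]
    return lower(ir) == UNIT_TEST_INPUT
```

If a change to `lower` makes `companion_test` fail, that is the signal
that `UNIT_TEST_INPUT` may no longer be reachable from IR and that the
unit test deserves a second look.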

Don't get me wrong, a mock target would be pretty simple and very
useful - enough for lib/CodeGen, so +1 for that by itself.
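
For what it's worth, here is a rough Python sketch of how I read the
"legality settings come from a configuration file" part.  The JSON
layout, the `MockTargetConfig` class, and the default-to-Expand rule
are all assumptions of mine, not anything proposed above; only the
action names (Legal, Expand) mirror the existing legalize actions.

```python
import json

# Hypothetical configuration format, invented for this sketch.
CONFIG_TEXT = """
{
  "legal_types": ["i32", "i64", "f64"],
  "operation_actions": {
    "add":  {"i32": "Legal", "i64": "Legal"},
    "sdiv": {"i32": "Legal", "i64": "Expand"},
    "fsin": {"f64": "Expand"}
  }
}
"""


class MockTargetConfig:
    """Toy stand-in for a mock target's legality tables."""

    def __init__(self, text):
        data = json.loads(text)
        self.legal_types = set(data["legal_types"])
        self.actions = data["operation_actions"]

    def is_type_legal(self, ty):
        return ty in self.legal_types

    def action_for(self, op, ty):
        # Assumption for this sketch: anything not listed is Expand.
        return self.actions.get(op, {}).get(ty, "Expand")


cfg = MockTargetConfig(CONFIG_TEXT)
print(cfg.is_type_legal("i32"), cfg.action_for("sdiv", "i64"))
```

A test could then ship its own small configuration and exercise one
legalization path at a time, without depending on what any real
target happens to declare legal.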

-Ahmed

> Thoughts?
>
>  -Hal
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory