[PATCH] D43962: [GlobalISel][utils] Adding the init version of Instruction Select Testgen

Thu Mar 1 12:38:55 PST 2018

rtereshin created this revision.
rtereshin added reviewers: qcolombet, ab, dsanders, aditya_nandakumar, bogner, volkan, aemerson, t.p.northover, rovka.
Herald added subscribers: mgrang, kristof.beyls, mgorny.

This is the first version of the testgen - a tool, currently implemented as an
llc MIR pass, that generates regression lit tests for GlobalISel's Instruction
Selector. Generation is done on a rule-by-rule basis and currently covers
selection rules automatically imported by TableGen from SelectionDAGISel.

What this tool is and isn't:

1. This is not a fuzzer for the Instruction Selector: it isn't trying to come up with malicious input to break it, nor is it trying to discover bugs in it.

2. This is a regression testing tool. Its main goal is to capture the current state of GlobalISel's InstructionSelect pass, providing the best test coverage for it with small and highly targeted tests that pass, and to catch any later regressions due to changes in:
   2.1 *.td definitions of the instructions and selection patterns;
   2.2 GlobalISel's emitter (the TableGen backend), including changes that intend to alter rules' priorities and changes that don't;
   2.3 manually written parts of the Instruction Selector.

3. Potentially this is also an analysis tool that may make it easier to see and control the actual effects of changes like those listed above on the Selector, detect dead rules, etc.

4. It may be extended in the future to generate tests for other passes of the GlobalISel's pipeline, and / or have a fuzzer mode of operation, but currently these aren't the goals.

Potential user stories:

1. New backend development.
2. Porting an existing backend from SelectionDAG ISel to GlobalISel.

While the first one is promising, it appears that the second one is more
prominent right now and therefore the main target of this tool.

Design goals:

1. As we mostly care about providing regression testing of the InstructionSelect pass early in the development of GlobalISel implementations for pre-existing targets, we cannot rely on any other GlobalISel passes being well-developed and fully functional. In particular, we expect the InstructionSelect pass to be well ahead of everything else due to the semi-automatic porting mechanism.

  See https://reviews.llvm.org/rL326396 as an example of breaking ties with the Legalizer, and selectUnconstrainedRegBanks in this patch as an example of the same w.r.t. RegBankSelect.

2. We want the testgen to be relatively robust and able to gracefully handle non-functional changes, for instance, changes in the typical order of the MatchTable opcodes for rules, or even the presence of specific opcodes (like the number-of-operands check), or changes in the concrete serialization format for MIR.

3. We want the testgen to be as target-independent and generic as possible and to impose as little maintenance burden on backend writers as possible.

4. If it doesn't jeopardize other goals and isn't too difficult to do, we want the testgen to generate natural-looking tests that are likely to come out the same as if written by a human.

Design decisions made:

1. The current implementation of the testgen uses TableGen'erated MatchTables to generate the tests. We could have branched off earlier in terms of input data, but that would mean re-implementing too much of GlobalISel's emitter.

2. We're using only the matching parts of the MatchTable to generate MIR, and relying on the selector itself to generate the FileCheck checks for the expected output, for a few major reasons:
   2.1 it simplifies the implementation;
   2.2 it reduces the number of tests failing as of the time of their generation due to the MIR being selected by a rule other than the one intended, which is desirable as we aren't fuzzing the selector, but trying to generate passing tests.

Usage:

llc -mtriple aarch64-- -run-pass instruction-select-testgen -simplify-mir input.mir -o output.mir

will add a number of Machine Functions, one per imported *.td-defined
selection rule, to input.mir and write the result as output.mir.

Command line options:

1. -testgen-from-rule=N -testgen-until-rule=M - generate tests for a subrange of rules only;

2. -testgen-exclude-rules=N{,N} - skip specific rules;

3. -testgen-include-only=N{,N} - generate tests for explicitly listed rules only;

4. -testgen-set-all-features - speculatively satisfy all target, module, and function feature requirements to cover feature-specific rules;

5. -testgen-no-abi - don't speculate on ABI boundaries trying to make the test look natural and test COPY selection, but just IMPLICIT_DEF undefined vregs instead.

Note:
With -testgen-no-abi=false, the testgen at some point tried to emit real RET
opcodes by using CallLowering::lowerReturn and deriving IR Types from LLTs,
but that proved to be unreliable for most targets and created an extra
dependency.

This patch also provides utils/update_instruction_select_testgen_tests.sh tool
that would generate a couple of lit-tests:

usage: ./utils/update_instruction_select_testgen_tests.sh <testgen'd file> <llc binary> <target triple> [extra llc args]

for instance, executing

../../utils/update_instruction_select_testgen_tests.sh ../../test/CodeGen/AArch64/GlobalISel/arm64-instruction-select-testgen-testgend.mir ./bin/llc aarch64--

from a build/obj directory would create 2 files:

../../test/CodeGen/AArch64/GlobalISel/arm64-instruction-select-testgen-testgend.mir
and
../../test/CodeGen/AArch64/GlobalISel/arm64-instruction-select-testgen-selected.mir

testing that the testgen outputs the same MIR and that the selector selects
that MIR the same way, respectively.
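
The pairing of the two file names in the example can be sketched as below. This
is only an illustration of the naming seen in that one example (a "-testgend.mir"
suffix replaced by "-selected.mir"); whether the script derives the second name
this way is an assumption, and selected_path is a hypothetical helper.

```python
from pathlib import PurePosixPath


def selected_path(testgend_path):
    """Map '<name>-testgend.mir' to '<name>-selected.mir' (assumed convention)."""
    path = PurePosixPath(testgend_path)
    suffix = "-testgend.mir"
    if not path.name.endswith(suffix):
        raise ValueError(f"expected a file name ending in {suffix!r}: {path.name}")
    return path.with_name(path.name[:-len(suffix)] + "-selected.mir")


print(selected_path(
    "../../test/CodeGen/AArch64/GlobalISel/"
    "arm64-instruction-select-testgen-testgend.mir"))
```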

Coverage:

        | Rules    | Fail to | Tests     | Selected by the
Target  | Imported | Select  | Generated | Rule Intended
--------+----------+---------+-----------+----------------
AArch64 |  1654    |  0.0%   |  1449     |  85%
ARM     |  1055    |  0.2%   |   991     |  78%
x86     |   887    | 13.8%   |   765     |  68%

"Fail to Select" means that a generated test crashed the selector or triggered
an assertion in it; in practice, such rules are something to pass to
-testgen-exclude-rules. The major reason for this right now is the limited
support for COPY_TO_REGCLASS in *.td-defined patterns in the GlobalISel
importing mechanism.
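
As a quick reading of the two count columns in the table, the share of imported
rules for which a test was generated can be computed directly. The snippet below
is just that arithmetic; it does not attempt to interpret the percentage
columns, whose denominators are not spelled out here.

```python
# (rules imported, tests generated), copied from the coverage table above.
coverage = {
    "AArch64": (1654, 1449),
    "ARM": (1055, 991),
    "x86": (887, 765),
}


def generated_share(imported, generated):
    """Percentage of imported rules that got a generated test, to one decimal."""
    return round(100.0 * generated / imported, 1)


for target, (imported, generated) in coverage.items():
    share = generated_share(imported, generated)
    print(f"{target}: {share}% of imported rules have a generated test")
```

This gives 87.6% for AArch64, 93.9% for ARM, and 86.2% for x86.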

"Selected by the Rule Intended" basically means the target coverage provided by
the tool. A test could be selected by a rule different from the rule that was
used to generate it for a variety of reasons, listed approximately from the
most common to the rarest:

1. The test generated isn't specific enough due to:
   1.1 lack of support for complex patterns in the testgen;
   1.2 too basic support for immediate predicates in the testgen;
   1.3 rules genuinely intersecting with each other, and the local approach of the testgen not considering rules partially hiding each other.

2. A rule is genuinely dead because:
   2.1 it was rendered dead by GlobalISel;
   2.2 it was dead in SelectionDAG ISel to begin with;
   2.3 it is rendered dead by manually written parts of the selector executing before trying the TableGen'erated selectImpl.
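
The reasons above can be illustrated with a toy model. This is not how the patch
is implemented: the Rule class, the field names, and the four rules are all
hypothetical, and the "selector" is simply "first matching rule wins". The
sketch only shows how a test generated from one rule can end up selected by
another, either because the generator cannot see an opaque predicate or because
an earlier, more general rule hides a later one.

```python
class Rule:
    """Toy selection rule: a visible pattern plus an opaque extra predicate."""

    def __init__(self, name, pattern, predicate=None):
        self.name = name
        self.pattern = pattern          # fields the toy generator understands
        self.predicate = predicate      # check the toy generator cannot see

    def matches(self, inst):
        if any(inst.get(key) != value for key, value in self.pattern.items()):
            return False
        return self.predicate is None or self.predicate(inst)


def generate_test(rule):
    """Build an input from the visible pattern only, using fixed defaults."""
    inst = {"opcode": None, "type": "s64", "imm": 1000}
    inst.update(rule.pattern)
    return inst


def select(rules, inst):
    """Return the name of the first rule (in priority order) that matches."""
    for rule in rules:
        if rule.matches(inst):
            return rule.name
    return None


RULES = [
    Rule("mul_small_imm", {"opcode": "G_MUL"}, lambda i: 0 <= i["imm"] < 256),
    Rule("add_any", {"opcode": "G_ADD"}),
    Rule("add_s32", {"opcode": "G_ADD", "type": "s32"}),  # hidden by add_any
    Rule("mul_any", {"opcode": "G_MUL"}),
]

# Map each rule to the rule that actually selects its generated test.
selected_by = {rule.name: select(RULES, generate_test(rule)) for rule in RULES}
hits = sum(1 for name, winner in selected_by.items() if name == winner)

for name, winner in selected_by.items():
    print(f"test for {name} selected by {winner}")
print(f"{hits} of {len(RULES)} tests selected by the rule intended")
```

In this toy setup the test for mul_small_imm is picked up by mul_any (the
generator did not satisfy the immediate predicate), and the test for add_s32 is
picked up by add_any (the rule is effectively dead), so two of the four tests
are selected by the rule intended.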

Known deficiencies:

1. Testgen currently cannot be easily extended by a target to support complex patterns; such support should greatly improve coverage.

2. Testgen's way of dealing with features is very sketchy at the moment and needs to be improved.

3. Testgen should probably be a tool separate from the llc binary.

These are listed approximately from the most important to fix soon to the
least important.

A couple of dependencies for this patch as well as the tests generated
are coming soon in separate patches.

See also test/CodeGen/AArch64/GlobalISel/select-with-no-legality-check.mir,
currently committed, for an example of the testgen's output.

Repository:
  rL LLVM

https://reviews.llvm.org/D43962

Files:
  include/llvm/CodeGen/GlobalISel/InstructionSelectTestgen.h
  include/llvm/CodeGen/GlobalISel/InstructionSelector.h
  include/llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h
  include/llvm/CodeGen/GlobalISel/InstructionSelectorTestgen.h
  include/llvm/InitializePasses.h
  include/llvm/Support/CodeGenCoverage.h
  lib/CodeGen/GlobalISel/CMakeLists.txt
  lib/CodeGen/GlobalISel/GlobalISel.cpp
  lib/CodeGen/GlobalISel/InstructionSelect.cpp
  lib/CodeGen/GlobalISel/InstructionSelectTestgen.cpp
  lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp
  lib/Support/CodeGenCoverage.cpp
  test/TableGen/GlobalISelEmitter.td
  utils/TableGen/GlobalISelEmitter.cpp
  utils/update_instruction_select_testgen_tests.sh
