[llvm-dev] [RFC] Compiled regression tests.

Wed Jul 1 12:33:46 PDT 2020

On Wed, Jul 1, 2020 at 12:24, Robinson, Paul <paul.robinson at sony.com>
wrote:
>
> What I actually meant re. CHECK-DAG is to take this
>
>     72 ; CHECK: ![[ACCESS_GROUP_LIST_3]] = !{![[ACCESS_GROUP_INNER:[0-9]+]], ![[ACCESS_GROUP_OUTER:[0-9]+]]}
>     73 ; CHECK: ![[ACCESS_GROUP_INNER]] = distinct !{}
>     74 ; CHECK: ![[ACCESS_GROUP_OUTER]] = distinct !{}
>
> and turn it into this
>
>     72 ; CHECK: ![[ACCESS_GROUP_LIST_3]] = !{![[ACCESS_GROUP_INNER:[0-9]+]], ![[ACCESS_GROUP_OUTER:[0-9]+]]}
>     73 ; CHECK-DAG: ![[ACCESS_GROUP_INNER]] = distinct !{}
>     74 ; CHECK-DAG: ![[ACCESS_GROUP_OUTER]] = distinct !{}

Note that `CHECK-DAG: !{{[0-9]+}} = distinct !{}` could match multiple
lines. That requires us to find ACCESS_GROUP_INNER/ACCESS_GROUP_OUTER
beforehand, which makes us check more than we may want to.
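
To illustrate the ambiguity, here is a minimal sketch that models only the
regex part of the pattern with std::regex. It is not how FileCheck itself
matches. The metadata lines, the node numbers, and the helper names
(countMatches, sampleOutput, demoAmbiguity) are made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Count the lines that a pattern matches in full.
int countMatches(const std::vector<std::string> &lines, const std::regex &re) {
  int n = 0;
  for (const std::string &line : lines)
    if (std::regex_match(line, re))
      ++n;
  return n;
}

// Hypothetical metadata dump; the node numbers are invented.
std::vector<std::string> sampleOutput() {
  return {"!3 = !{!4, !5}", "!4 = distinct !{}", "!5 = distinct !{}"};
}

void demoAmbiguity() {
  const std::vector<std::string> lines = sampleOutput();
  // Rough equivalent of `!{{[0-9]+}} = distinct !{}`: any node number.
  const std::regex anyDistinct(R"(![0-9]+ = distinct !\{\})");
  // The same pattern once ACCESS_GROUP_INNER has been bound to 4.
  const std::regex boundInner(R"(!4 = distinct !\{\})");
  assert(countMatches(lines, anyDistinct) == 2); // two candidate lines
  assert(countMatches(lines, boundInner) == 1);  // exactly one line
}
```

The unbound pattern cannot tell the two `distinct !{}` nodes apart, which
is why the variables have to be captured on the list line first.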

> except that I think I was misled by the “non-semantic” remark about the
> change. Did you mean that the order of the INNER and OUTER elements (line
> 72) has been swapped?  That sounds semantic, as far as the structure of the
> metadata is concerned!

Maybe we should clarify what I mean with "non-semantic":

Textual change: Some characters in a line change.
Structural change: The graph of MDNodes changes.
Semantic change: There is a difference in what the MDNode means; this
depends on where the MDNode is used.

(I am not always consistent either)

The example patch is a structural, but non-semantic change: The order of
operands in one MDNode is changed, but semantically the change is
meaningless: To determine whether a loop is parallel, only containment in
the access group list matters, not the order in the list. An interesting
property of parallel-loop-md-merge.ll is that although there is a
structural change, there is no textual change in that node's line.
parallel-loop-md-merge.ll tries to test semantics using the text output.
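
As a minimal sketch of that distinction, the toy model below represents an
access group list as an ordered vector of integer ids and asks only the
containment question. The types and functions (AccessGroup,
AccessGroupList, containsGroup, isParallelToy) are hypothetical stand-ins,
not LLVM's data structures or its implementation of isAnnotatedParallel().

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-in for an access group: just an integer id (hypothetical).
using AccessGroup = int;

// Toy stand-in for a loop's access group list: an ordered operand list.
using AccessGroupList = std::vector<AccessGroup>;

// Semantic query: is `group` contained in `list`? Operand order is ignored.
bool containsGroup(const AccessGroupList &list, AccessGroup group) {
  return std::find(list.begin(), list.end(), group) != list.end();
}

// Toy version of the parallel-loop question: every access group used in
// the loop must be contained in the loop's list.
bool isParallelToy(const AccessGroupList &loopList,
                   const std::vector<AccessGroup> &usedGroups) {
  return std::all_of(
      usedGroups.begin(), usedGroups.end(),
      [&](AccessGroup g) { return containsGroup(loopList, g); });
}

void demoOrderInsensitive() {
  const AccessGroup inner = 1, outer = 2;
  const AccessGroupList before = {inner, outer};
  const AccessGroupList after = {outer, inner}; // operands swapped
  assert(before != after);                      // structural change
  // Same semantic answer for both operand orders.
  assert(isParallelToy(before, {inner, outer}));
  assert(isParallelToy(after, {inner, outer}));
}
```

A check written against the query gives the same answer before and after
the swap, whereas a check written against the operand order does not.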

> But okay, let’s call that a syntactic change, and a test relying on the
> order of the parameters will break.  Which it did.  And the correct fix is
> instead
>
>     72 ; CHECK: ![[ACCESS_GROUP_LIST_3]] = !{![[ACCESS_GROUP_OUTER:[0-9]+]], ![[ACCESS_GROUP_INNER:[0-9]+]]}
>
> is it not?  To reflect the change in order?

The test should not break in the first place since the change is
insignificant.
Every spurious test failure adds to the burden of changing something, and
it is easy to break hundreds of regression tests.

> But let’s say I’m the one doing this presumably innocuous change, and
> have no clue what I’m doing, and don’t know much about how FileCheck works
> (which is pretty typical of the community, I admit).  You’ve shown issues
> with trying to diagnose the FileCheck results.
>
> How would the compiled regression test fail?  How would it be easier to
> identify and repair the issue?

It would not fail in the first place (admittedly this requires a
well-designed test which I think is easier to write using LLVM's API
directly).

If it does fail, it should be on semantic changes only. The called API
directly points to what it is checking, in this case isAnnotatedParallel():

  /// Returns true if the loop is annotated parallel.
  ///
  /// A parallel loop can be assumed to not contain any dependencies between
  /// iterations by the compiler. That is, any loop-carried dependency checking
  /// can be skipped completely when parallelizing the loop on the target
  /// machine. Thus, if the parallel loop information originates from the
  /// programmer, e.g. via the OpenMP parallel for pragma, it is the
  /// programmer's responsibility to ensure there are no loop-carried
  /// dependencies. The final execution order of the instructions across
  /// iterations is not guaranteed, thus, the end result might or might not
  /// implement actual concurrent execution of instructions across multiple
  /// iterations.
  bool isAnnotatedParallel() const;

which should help you determine whether the patch intended to have this
effect or whether isAnnotatedParallel() should be fixed. In this case,
fixing isAnnotatedParallel() fixes the regression tests as well as LLVM
itself.

> Re. how MIR makes testing easier: it is a serialization of the data that
> a machine-IR pass operates on, which makes it feasible to feed canned MIR
> through a single pass in llc and look at what exactly that pass did.  Prior
> to MIR, we had to go from IR to real machine code and infer what was going
> on in a pass after multiple levels of transformation had occurred.  It was
> very black-box, and the black box was way too big.

mlir-opt ?

Michael