[llvm-dev] [RFC] Compiled regression tests.

Mehdi AMINI via llvm-dev llvm-dev at lists.llvm.org
Wed Jul 1 13:22:17 PDT 2020


On Wed, Jul 1, 2020 at 7:34 AM Hal Finkel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
> On 7/1/20 12:40 AM, Michael Kruse via llvm-dev wrote:
>
> To illustrate some difficulties with FileCheck, lets make a non-semantic
> change in LLVM:
>
>     --- a/llvm/lib/Analysis/VectorUtils.cpp
>     +++ b/llvm/lib/Analysis/VectorUtils.cpp
>     @@ -642,8 +642,8 @@ MDNode *llvm::uniteAccessGroups(MDNode
> *AccGroups1, MDNode *AccGroups2) {
>          return AccGroups1;
>
>        SmallSetVector<Metadata *, 4> Union;
>     -  addToAccessGroupList(Union, AccGroups1);
>        addToAccessGroupList(Union, AccGroups2);
>     +  addToAccessGroupList(Union, AccGroups1);
>
>        if (Union.size() == 0)
>          return nullptr;
>
> This changes the order of access groups when merging them.
> Same background: Access groups are used to mark that a side-effect (e.g.
> memory access) does not limit the parallelization of a loop, added by e.g.
> #pragma clang loop vectorize(assume_safety). Therefore a loop can be
> assumed to be parallelizable without further dependency analysis if all its
> side-effect instructions are marked this way (and has no loop-carried
> scalar dependencies).
> An access group represents the instructions marked for a specific loop. A
> single instruction can be parallel with regrads to multiple loop, i.e.
> belong to multiple access groups. The order of the access groups is
> insignificant, i.e. whether either `AccGroups1` or `AccGroups2` comes
> first, the instruction is marked parallel for both loops.
>
> Even the change should have no effect on correctness, `ninja check-llvm`
> fails:
>
>     test/Transforms/Inline/parallel-loop-md-merge.ll
>
>      string not found in input
>     ; CHECK: ![[ACCESSES_INNER]] = !{!"llvm.loop.parallel_accesses",
> ![[ACCESS_GROUP_INNER]]}
>              ^
>     <stdin>:45:1: note: scanning from here
>     !7 = !{!"llvm.loop.parallel_accesses", !5}
>     ^
>     <stdin>:45:1: note: with "ACCESSES_INNER" equal to "7"
>     !7 = !{!"llvm.loop.parallel_accesses", !5}
>     ^
>     <stdin>:45:1: note: with "ACCESS_GROUP_INNER" equal to "4"
>     !7 = !{!"llvm.loop.parallel_accesses", !5}
>     ^
>
> The line FileCheck points to has nothing to do with the changed order of
> access groups.
> (what's the point of annotating the `!7 = ...` line repeatedly? Why not
> pointing to the lines where ACCESSES_INNER/ACCESS_GROUP_INNER have their
> values from?)
>
> The CHECK lines, annotated with their line numbers from
> parallel-loop-md-merge.ll, are:
>
>     68 ; CHECK: store double 4.200000e+01, {{.*}} !llvm.access.group
> ![[ACCESS_GROUP_LIST_3:[0-9]+]]
>     69 ; CHECK: br label %for.cond.i, !llvm.loop ![[LOOP_INNER:[0-9]+]]
>     70 ; CHECK: br label %for.cond, !llvm.loop ![[LOOP_OUTER:[0-9]+]]
>     71
>     72 ; CHECK: ![[ACCESS_GROUP_LIST_3]] =
> !{![[ACCESS_GROUP_INNER:[0-9]+]], ![[ACCESS_GROUP_OUTER:[0-9]+]]}
>     73 ; CHECK: ![[ACCESS_GROUP_INNER]] = distinct !{}
>     74 ; CHECK: ![[ACCESS_GROUP_OUTER]] = distinct !{}
>     75 ; CHECK: ![[LOOP_INNER]] = distinct !{![[LOOP_INNER]],
> ![[ACCESSES_INNER:[0-9]+]]}
>     76 ; CHECK: ![[ACCESSES_INNER]] = !{!"llvm.loop.parallel_accesses",
> ![[ACCESS_GROUP_INNER]]}
>     77 ; CHECK: ![[LOOP_OUTER]] = distinct !{![[LOOP_OUTER]],
> ![[ACCESSES_OUTER:[0-9]+]]}
>     78 ; CHECK: ![[ACCESSES_OUTER]] = !{!"llvm.loop.parallel_accesses",
> ![[ACCESS_GROUP_OUTER]]}
>
> And the output of `FileCheck --dump-input -v` is:
>
>          38: !0 = !{!1}
>          39: !1 = distinct !{!1, !2, !"callee: %A"}
>          40: !2 = distinct !{!2, !"callee"}
>          41: !3 = !{!4, !5}
> check:72     ^~~~~~~~~~~~~~
>          42: !4 = distinct !{}
> check:73     ^~~~~~~~~~~~~~~~~
>          43: !5 = distinct !{}
> check:74     ^~~~~~~~~~~~~~~~~
>          44: !6 = distinct !{!6, !7}
> check:75     ^~~~~~~~~~~~~~~~~~~~~~~
>          45: !7 = !{!"llvm.loop.parallel_accesses", !5}
> check:76     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match
> found
>          46: !8 = distinct !{!8, !9}
> check:76     ~~~~~~~~~~~~~~~~~~~~~~~
>          47: !9 = !{!"llvm.loop.parallel_accesses", !4}
> check:76     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> To fix this regression test, one needs to solve the following questions:
>
>    -  Is `!0 = !{!1}` supposed to match with anything?
>    -  Which of the CHECK lines are the ones that are important for the
>    regression tests?
>    -  The `![[SOMETHING]] = distinct !{}` line appears twice. Which one
>    is ACCESS_GROUP_INNER/ACCESS_GROUP_OUTER?
>    -  There are two lines with `llvm.loop.parallel_accesses` as well. The
>    author of this regression test was kind enough to give the placeholders
>    descriptive names, but do they match with what FileCheck matched?
>    -  LOOP_INNER and LOOP_OUTER are only distinguished by position and
>    `.i`. Why does the inner loop come first?
>    -  Before the patch that modifies the order, the output of
>    --dump-input=always was:
>
>      38: !0 = !{!1}
>      39: !1 = distinct !{!1, !2, !"callee: %A"}
>      40: !2 = distinct !{!2, !"callee"}
>      41: !3 = !{!4, !5}
>      42: !4 = distinct !{}
>      43: !5 = distinct !{}
>      44: !6 = distinct !{!6, !7}
>      45: !7 = !{!"llvm.loop.parallel_accesses", !4}
>      46: !8 = distinct !{!8, !9}
>      47: !9 = !{!"llvm.loop.parallel_accesses", !5}
>
> This seems to suggest that our change caused !4 and !5 in lines 45 and 47
> to swap. This is not what the patch did, which changes to order of two
> MDNode arguments. The only MDNode that has two other MDNodes as operands is
> line (line !6 and !8 are representing loops as they reference themselves as
> first arguments; noalias scopes do this as well, but there is no noalias
> metadata in this example).
>
> An hasty fix is to change CHECK lines 76 and 77 to
>
>     76 ; CHECK: ![[ACCESSES_INNER]] = !{!"llvm.loop.parallel_accesses",
> ![[ACCESS_GROUP_OUTER]]}
>     78 ; CHECK: ![[ACCESSES_OUTER]] = !{!"llvm.loop.parallel_accesses",
> ![[ACCESS_GROUP_INNER]]}
>
> However, ACCESS_GROUP_OUTER belongs to ACCESSES_INNER and
> ACCESS_GROUP_INNER to ACCESSES_OUTER? This cannot be true.
>
>
> The reason is that MDNodes are emitted in topological order of a
> depth-first search. The patch changed the order of ACCESS_GROUP_INNER and
> ACCESS_GROUP_OUTER in line 41 but because in the mitted IR file, the first
> argument of !3 is emitted before the second argument, ACCESS_GROUP_OUTER is
> on line 42 and ACCESS_GROUP_INNER on line 43 instead.
>
> Even though this regression test uses proper regex-ing of metadata node
> numbers, it is still fragile as it fails on irrelevant changes. Once it
> fails, it requires time to fix properly because it is non-obvious which
> lines are intended to match which output lines. The hasty fix would have
> mixed up ACCESS_GROUP_OUTER/ACCESS_GROUP_INNER and I wouldn't expect
> someone who is working on the inliner to take the time to understand all
> the loop metadata. Set-semantics of metadata occurs often, such as noalias
> metadata, loop properties, tbaa metadata, etc.
>
> The difficulty gets amplified when there are multiple functions in the
> regression tests; metadata of all functions all appended to the end and
> MDNodes with same content uniqued, making it impossible to associate
> particular MDNodes with specific functions.
>
>
> Ideally the regression test would be robust and understandable, achievable
> with two asserts in a unittest:
>
>     Loop &OuterLoop = *LI->begin();
>     ASSERT_TRUE(OuterLoop.isAnnotatedParallel());
>     Loop &InnerLoop = *OuterLoop.begin();
>     ASSERT_TRUE(InnerLoop.isAnnotatedParallel());
>
>
> I definitely agree that we should not be trying to do this kind of
> checking using textual metadata-node matching in FileCheck. The alternative
> already available is to add an analysis pass with some kind of verifier
> output. This output, not the raw metadata itself, can be checked by
> FileCheck. We also need to check the verification code, but at least that's
> something we can keep just in one place. For parallel annotations, we
> already have such a thing (we can run opt -loops -analyze; e.g., in
> test/Analysis/LoopInfo/annotated-parallel-complex.ll). We also do this kind
> of thing for the cost model (by running with -cost-model -analyze). To what
> extent would making more-extensive use of this technique address the use
> cases you're trying to address?
>
In MLIR we create passes specifically for the purpose of testing this kind
of thing: the pass would traverse a structure and dump it in a textual
form. This is useful as a tool for developers, but mostly makes it really
nice to write FileCheck tests.

For example our dominance test is:
https://github.com/llvm/llvm-project/blob/master/mlir/test/Analysis/test-dominance.mlir
The test pass is here:
https://github.com/llvm/llvm-project/blob/master/mlir/test/lib/Transforms/TestDominance.cpp

-- 
Mehdi





> Maybe we should, also, just make more-extensive use of unit tests?
>
> Thanks again,
>
> Hal
>
>
>
>
> _______________________________________________
> LLVM Developers mailing listllvm-dev at lists.llvm.orghttps://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200701/28f5b4b9/attachment-0001.html>


More information about the llvm-dev mailing list