<div dir="ltr"><div dir="ltr">Am Mi., 1. Juli 2020 um 10:18 Uhr schrieb Robinson, Paul <<a href="mailto:paul.robinson@sony.com">paul.robinson@sony.com</a>>:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="EN-US">
<div class="gmail-m_-7796211237380652251WordSection1">
<p class="MsoNormal">The test as written is fragile because it requires a certain ordering. If the output order is not important, use CHECK-DAG rather than CHECK. This would be a failure to understand the testing tool.</p></div></div></blockquote><div><br></div><div>CHECK-DAG does not help here since what changes is within a list on the same line, and we have no CHECK-SAME-DAG or CHECK-DAG-SAME. Even if we had it, the actual line that changed is textually the same and FileCheck would need to backtrack deep into the following lines for alternative placeholder substitutions. It would look like</div><div><br></div><div>
<font face="monospace">CHECK-SAME-DAG: </font><span style="font-family:monospace">![[ACCESS_GROUP_INNER:[0-9]+</span><span style="font-family:monospace">]]</span></div><div>
<font face="monospace">CHECK-SAME-DAG: </font><span style="font-family:monospace">,</span></div><div>
<font face="monospace">CHECK-SAME-DAG: </font><span style="font-family:monospace">![[ACCESS_GROUP_OUTER:[0-9]+]]</span></div><div><br></div><div>which allows the comma to appear anywhere and I don't find readable.</div><div><br></div><div>My (naive?) conclusion is that textural checking is not the right tool here.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div lang="EN-US"><div class="gmail-m_-7796211237380652251WordSection1"><br>
<p class="MsoNormal">My experience, over a 40-year career, is that good software developers are generally not very good test-writers. These are different skills and good testing is frequently not taught. It’s easy to write fragile tests; you make your patch,
you see what comes out, and you write the test to expect exactly that output, using the minimum possible features of the testing tool. This is poor technique all around. We even have scripts that automatically generate such tests, used primarily in codegen
tests. I devoutly hope that the people who produce those tests responsibly eyeball all those cases.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">The proposal appears to be to migrate output-based tests (using ever-more-complicated FileCheck features) to executable tests, which makes it more like the software development people are used to instead of test-writing. But I don’t see
that necessarily solving the problem; seems like it would be just as easy to write a program that doesn’t do the right thing as to write a FileCheck test that doesn’t do the right thing.</p></div></div></blockquote><div><br></div><div>IMHO, having a tool that lets us express more precisely what is intended to be tested is already worth a lot. For instance, we usually don't care about SSA value names or MDNode numbers, but we have to put extra work into regex-ing those names away in FileCheck tests, and as a result most of our tests still expect the exact number for each metadata node. This is a problem if we want to emit new metadata nodes, because all of those tests then need to be updated.</div><div><br></div><div>This problem goes away if the test method ignored value names and MDNode numbers by default, and developers had to put in extra work if they actually want to verify them.</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div lang="EN-US"><div class="gmail-m_-7796211237380652251WordSection1"><p class="MsoNormal"><u></u><u></u></p>
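To illustrate that default, a comparison could canonicalize metadata numbers in order of first appearance before diffing, so only a test that opts in ever sees the concrete IDs. A hypothetical Python sketch (normalize and the IR lines are invented for illustration):

```python
import re

def normalize(ir):
    """Rewrite metadata node numbers (!0, !1, ...) to placeholders in
    order of first appearance, so comparisons ignore renumbering."""
    mapping = {}
    def repl(match):
        node = match.group(0)
        if node not in mapping:
            mapping[node] = "!md%d" % len(mapping)
        return mapping[node]
    return re.sub(r"!\d+", repl, ir)

# The same store with different metadata numbering normalizes to the
# same string, so emitting a new node elsewhere does not break the check:
a = normalize("store i32 0, i32* %p, !tbaa !3, !llvm.access.group !5")
b = normalize("store i32 0, i32* %p, !tbaa !7, !llvm.access.group !9")
assert a == b
```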
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Hal’s suggestion is more to the point: If the output we’re generating is not appropriate to the kinds of tests we want to perform, it can be worthwhile to generate different kinds of output. MIR is a case in point; for a long time it
was hard to introspect into the interval between IR and final machine code, but now it’s a lot easier.</p></div></div></blockquote><div><br></div><div>Can you elaborate on what makes it easier?</div><div><br></div><div>Michael</div></div></div>