[llvm-dev] [PATCH] D50328: [X86][SSE] Combine (some) target shuffles with multiple uses
Simon Pilgrim via llvm-dev
llvm-dev at lists.llvm.org
Tue Aug 7 12:05:15 PDT 2018
On 06/08/2018 22:25, David Greene via llvm-dev wrote:
> [NOTE: Removed Phab and reviewers]
>
>> ================
>> Comment at: test/CodeGen/X86/2012-01-12-extract-sv.ll:12
>> +; CHECK-NEXT: vblendps {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3]
>> +; CHECK-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,0,0,0]
>> ; CHECK-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
>> ----------------
>> greened wrote:
>>> Can we make this test less brittle by using FileCheck variables?
>>> This goes for pretty much every test in this patch.
>> I'm sorry but no - its been repeatedly proven that using
>> update_llc_test_checks.py on the majority of x86 tests is the way
>> forward - it speeds up creation of tests (x86 by far has the highest
>> test coverage), makes regeneration of checks trivial and it prevents
>> dodgy code being 'hidden' (either on purpose or by
>> accident). Additionally many x86 subtargets have different instruction
>> behaviours depending on the registers used so hidng the registers
>> behind regexps make it that more difficult to track.
> Ok, humor me a bit, as I am not at all familiar with
> update_llc_test_checks.py.
>
> How is it "proven" that the script is better than writing robust and
> focused tests? IME, updating tests to reflect changed behavior is
> problematic. How do we know the updates reflect "better" compiler
> behavior? Suppose I make a change and a bunch of llc tests fail. What
> do I do? I have to examine each one to see if the new behavior is
> "correct" or not. Assuming I think it is, I guess I run
> update_llc_test_checks.py. But I didn't write the test so how do I know
> the new output is "correct?"
Changing a test's IR to avoid an issue in a patch is very problematic,
but if any test's codegen changes because of a patch then it just needs
to be reviewed, preferably by someone who has touched that test in the
past.
> How does the script speed up creation of new tests?
By automating the creation of the CHECKs, allowing the developer to
focus on writing/reducing the IR. There is no way we could have added
the exhaustive tests for much of the custom lowering (vector and scalar
codegen) across so many x86 subtargets without the scripts to do the
checks, it would have taken far too long - people would simply not spend
the time to add as much coverage and we would end up missing things.
> What kinds of dodgy code being hidden are you concerned about? Can you
> provide an example?
It happens in many ways, for example: often in the intro/exit code that
isn't considered directly relevant; or it checks that exactly one
instance of an instruction occurs but doesn't consider that a similar
instruction might be there as well (common with shuffles); or failure to
completely show how 'pre-committed' tests are changed by a patch. The
update script helps ensure we have a whole view of the codegen to avoid
these, and then in review more eyes can check it.
> If a test is looking for different behavior based on register use, then
> of course the test should be written to detect specific register uses.
> If the register use changes, the test fails because a bug was
> introduced.
>
> I would bet 95% of X86 tests don't care what registers are used
> whatsoever. From my perspective, it would be much better to have 95% of
> tests never change over time, leaving a much smaller set of tests that
> may never (ideally will not ever) need to be changed, because they are
> written to expect specific sequences and violations of those are bugs.
True - many test cases don't care about the exact register used. But
register/codegen changes indicates a change in behaviour that can be
investigated and help ensure the patch is doing what you want.
> Tests should be focused. If we're using, for example, a bunch of tests
> develped for shuffle selection as proxies for register allocation,
> that's not good. We should have specific register allocation tests.
I don't know of any examples where we are doing anything so brittle
(although many bug tests could be testing for that "perfect storm", you
can only reduce so far) - we have instructions that can only use certain
registers (e.g. DIV/MUL, PBLENDV's implicit use of xmm0 or EVEX's upper
15 zmm registers) and ensuring that we handle that efficiently can be
more than a regalloc only issue.
> Is this unreasonable? Can we not incrementally improve the tests?
Ensuring tests are focussed and continue to test what they are supposed
to is never unreasonable.
More information about the llvm-dev
mailing list