[PATCH] D46008: [X86][AArch64][NFC] Add tests for vector masked merge unfolding

Wed May 9 13:43:23 PDT 2018

lebedev.ri added inline comments.

================
Comment at: test/CodeGen/X86/unfold-masked-merge-vector-variablemask-const.ll:39-41
+  %x = load <4 x i32>, <4 x i32> *%px, align 16
+  %y = load <4 x i32>, <4 x i32> *%py, align 16
+  %mask = load <4 x i32>, <4 x i32> *%pmask, align 16
----------------
spatel wrote:
> What's the reason for using loads in these tests?
They cut the count of check-lines for these <4 x i32> tests for SSE1,
since it can't be natively passed as argument.
https://godbolt.org/g/azr1NQ

================
Comment at: test/CodeGen/X86/unfold-masked-merge-vector-variablemask.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=-sse,-sse2 < %s | FileCheck %s --check-prefix=CHECK-BASELINE
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+sse,-sse2 < %s | FileCheck %s --check-prefix=CHECK-SSE
----------------
spatel wrote:
> lebedev.ri wrote:
> > lebedev.ri wrote:
> > > RKSimon wrote:
> > > > lebedev.ri wrote:
> > > > > spatel wrote:
> > > > > > Why is this config interesting? IMO, it just distracts from the cases that we do care about, but I may not be seeing it.
> > > > > I think we do want to check that we don't do anything stupid in `andn`-less case,
> > > > > much like without `BMI` in scalar case. 
> > > > I agree, the non-SSE tests aren't very useful (and I'm a little dubious about SSE1 tbh).
> > > > 
> > > > What MIGHT be useful is a XOP pass:
> > > > 
> > > > ; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+sse,-sse2 < %s | FileCheck %s --check-prefixes=CHECK,CHECK-SSE,CHECK-SSE1
> > > > ; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+sse,+sse2 < %s | FileCheck %s --check-prefixes=CHECK,CHECK-SSE,CHECK-SSE2
> > > > ; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+xop < %s | FileCheck %s --check-prefix=CHECK,CHECK-XOP
> > > > 
> > > > As XOP is the only X86 ISA with a bsl style vector instruction (PCMOV) - maybe add this to both x86 test files?
> > > If i drop baseline tests, which tests will verify what we do in that case?
> > > I know i needed them when working on D46528, i don't think anything else tests that.
> > > 
> > > Will look into `XOP`..
> > BTW, is this `--check-prefixes=` magic documented somewhere?
> > Especially, why is the `CHECK,CHECK-SSE,CHECK-SSE2` the correct order that works,
> > while `CHECK,CHECK-SSE,CHECK-SSE1,CHECK-SSE2` results in lost data?
> > I'm asking because i have tried the second variant, and failed.
> > Now that i have seen the first variant, i guess i understand the logic, but still..
> I didn't experiment with this particular case, but I think you hit a long-standing shortcoming in update_llc_test_checks.py:
> 
> ```
>       # FIXME: We should use multiple check prefixes to common check lines. For
>       # now, we just ignore all but the last.
> 
> ```
> ...so it only matches common lines based on the order that they are encountered/created.
Yes, i did read that tool, and did find where the problem happens
`/build/llvm/utils/UpdateTestChecks/common.py`
```
def build_function_body_dictionary(function_re, scrubber, scrubber_args, raw_tool_output, prefixes, func_dict, verbose):
.....
    for prefix in prefixes:
      if func in func_dict[prefix] and func_dict[prefix][func] != scrubbed_body:
        if prefix == prefixes[-1]:
          print('WARNING: Found conflicting asm under the '
                               'same prefix: %r!' % (prefix,), file=sys.stderr)
        else: 
          func_dict[prefix][func] = None                                              <-    HERE
          continue

      func_dict[prefix][func] = scrubbed_body

```

Repository:
  rL LLVM

https://reviews.llvm.org/D46008