[PATCH] D46008: [X86][AArch64][NFC] Add tests for vector masked merge unfolding

Tue Apr 24 08:04:58 PDT 2018

spatel added a comment.

> Also, i think one more piece is missing - shouldn't the blend instruction be used?

If you're thinking of pblendvb or similar SSE4.1 instructions: "The mask bits are the most significant bit in each...element."
So those aren't bitwise selects, they're element-size selects. There might be some flavor of AVX512 that has a bsl equivalent? I still don't know AVX512.

I'm usually in favor of more thorough regression testing, but this patch is where computing reality takes over for me. :)
The x86 file is at least 10x bigger than I'd prefer. This level of testing permutation is not sustainable. At some point, running 'make check' will take so long for x86, that people will stop doing it. This may already be happening...personally, I used to run 'make check' for x86 with debug builds, but it takes too long now, so now I only check with release+asserts and even that takes too long on a portable.

This is all just my opinion. Afaik, there are no official LLVM guidelines on this, so if the consensus is that this level of testing is better, I'll live with it.

But there are a few simple ways to shrink this patch:

1. One illegal vector type (or 1 smaller-than-legal and 1 larger-than-legal) is fine to prove that we don't do something crazy; >1 doesn't add much value.
2. x86 adds to the vector ISA with every chip, but there's little or no difference to this transform, so SSE1 and AVX1 (and 1 flavor of AVX512?) are all that's useful.
3. Use multiple '--check-prefix' options per RUN to reduce the number of CHECK lines (I think '--check-prefixes' should work too). See existing vector test files for examples.

Repository:
  rL LLVM

https://reviews.llvm.org/D46008