<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/123456">123456</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Missed optimization: all-zero checks written with libstdc++ std::experimental::simd are not lowered to a single vptest on x86-64 AVX
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          ImpleLee
      </td>
    </tr>
</table>

<pre>
    The following code uses libstdc++'s experimental simd to detect several all-zero patterns, each of which can be implemented with a single vptest instruction. All of the code is available at https://godbolt.org/z/Kx68E1T6v .

```c++
#include <experimental/simd>
#include <cstdint>
namespace stdx = std::experimental;

template <class T, std::size_t N>
using simd_of = stdx::simd<T, stdx::simd_abi::deduce_t<T, N>>;

using data_t = simd_of<std::int32_t, 4>;

bool simple_ptest(data_t x) {
    return all_of(x == 0);
}

bool ptest_and(data_t a, data_t b) {
    return all_of((a & b) == 0);
}

bool ptest_andn(data_t a, data_t b) {
    return all_of((a & ~b) == 0);
}
```

Equivalent assembly (hand-written, AT&T syntax):

```asm
simple_ptest:
        vptest  %xmm0, %xmm0
        sete    %al
        ret
ptest_and:
        vptest  %xmm0, %xmm1
        sete    %al
        ret
ptest_andn:
        vptest  %xmm0, %xmm1
        setc    %al
        ret
```
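
To make explicit why these sequences are sufficient, here is a minimal scalar sketch of the flags `vptest` sets. It assumes Intel operand order (`vptest op1, op2`), with ZF set when `op1 & op2` is all-zero and CF set when `~op1 & op2` is all-zero. The names `lanes`, `ptest_flags`, `model_vptest`, and the `model_*` functions are hypothetical helpers for illustration only; they are not part of libstdc++ or LLVM.

```c++
#include <array>
#include <cstddef>
#include <cstdint>

// Four 32-bit lanes, standing in for one xmm register (illustrative only).
using lanes = std::array<std::uint32_t, 4>;

// The two flags of interest after a vptest.
struct ptest_flags {
    bool zf;  // set when (op1 & op2) is all-zero
    bool cf;  // set when (~op1 & op2) is all-zero
};

// Scalar model of `vptest op1, op2` (Intel operand order).
constexpr ptest_flags model_vptest(const lanes& op1, const lanes& op2) {
    std::uint32_t and_bits = 0;
    std::uint32_t andn_bits = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        and_bits |= op1[i] & op2[i];
        andn_bits |= ~op1[i] & op2[i];
    }
    return {and_bits == 0, andn_bits == 0};
}

// all_of(x == 0): ZF of vptest x, x
constexpr bool model_simple_ptest(const lanes& x) {
    return model_vptest(x, x).zf;
}

// all_of((a & b) == 0): ZF of vptest a, b
constexpr bool model_ptest_and(const lanes& a, const lanes& b) {
    return model_vptest(a, b).zf;
}

// all_of((a & ~b) == 0): CF of vptest b, a, since CF tests (~b & a)
constexpr bool model_ptest_andn(const lanes& a, const lanes& b) {
    return model_vptest(b, a).cf;
}
```

In this model, `simple_ptest` and `ptest_and` read ZF, while `ptest_andn` reads CF with `b` as the first operand, which corresponds to the `sete`/`setc` choices in the hand-written assembly.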

However, clang++ generates the following code at `-O3 -march=x86-64-v3` (Intel syntax):

```asm
simple_ptest(std::experimental::parallelism_v2::simd<int, std::experimental::parallelism_v2::simd_abi::_VecBuiltin<16>>):
        vpxor   xmm1, xmm1, xmm1
        vpcmpeqd        xmm0, xmm0, xmm1
        vpcmpeqd        xmm1, xmm1, xmm1
        vptest  xmm0, xmm1
        setb    al
        ret

ptest_and(std::experimental::parallelism_v2::simd<int, std::experimental::parallelism_v2::simd_abi::_VecBuiltin<16>>, std::experimental::parallelism_v2::simd<int, std::experimental::parallelism_v2::simd_abi::_VecBuiltin<16>>):
        vpand   xmm0, xmm1, xmm0
        vpxor   xmm1, xmm1, xmm1
        vpcmpeqd        xmm0, xmm0, xmm1
        vpcmpeqd        xmm1, xmm1, xmm1
        vptest  xmm0, xmm1
        setb    al
        ret

ptest_andn(std::experimental::parallelism_v2::simd<int, std::experimental::parallelism_v2::simd_abi::_VecBuiltin<16>>, std::experimental::parallelism_v2::simd<int, std::experimental::parallelism_v2::simd_abi::_VecBuiltin<16>>):
        vpandn  xmm0, xmm1, xmm0
        vpxor   xmm1, xmm1, xmm1
        vpcmpeqd        xmm0, xmm0, xmm1
        vpcmpeqd        xmm1, xmm1, xmm1
        vptest  xmm0, xmm1
        setb    al
        ret
```
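
The sequence clang emits for `simple_ptest` appears to compute the same predicate in a roundabout way: it builds a per-lane equality mask with `vpcmpeqd`, materializes an all-ones register, and then reads CF from `vptest` to check that the mask is all-ones. A rough scalar sketch of that reading follows; the helper names (`lanes`, `lane_eq`, `model_vpcmpeqd`, `model_vptest_cf`, `clang_simple_ptest`, `direct_simple_ptest`) are hypothetical and exist only for this illustration.

```c++
#include <array>
#include <cstdint>

// Four 32-bit lanes, standing in for one xmm register (illustrative only).
using lanes = std::array<std::uint32_t, 4>;

// One lane of vpcmpeqd: all-ones when equal, zero otherwise.
constexpr std::uint32_t lane_eq(std::uint32_t a, std::uint32_t b) {
    return a == b ? 0xFFFFFFFFu : 0u;
}

// Scalar model of `vpcmpeqd dst, a, b`.
constexpr lanes model_vpcmpeqd(const lanes& a, const lanes& b) {
    return lanes{lane_eq(a[0], b[0]), lane_eq(a[1], b[1]),
                 lane_eq(a[2], b[2]), lane_eq(a[3], b[3])};
}

// CF after `vptest op1, op2` (Intel order): set when (~op1 & op2) is all-zero.
constexpr bool model_vptest_cf(const lanes& op1, const lanes& op2) {
    return ((~op1[0] & op2[0]) | (~op1[1] & op2[1]) |
            (~op1[2] & op2[2]) | (~op1[3] & op2[3])) == 0;
}

// Step-by-step model of the four instructions clang emits for simple_ptest.
constexpr bool clang_simple_ptest(const lanes& x) {
    const lanes zero{0, 0, 0, 0};                    // vpxor    xmm1, xmm1, xmm1
    const lanes mask = model_vpcmpeqd(x, zero);      // vpcmpeqd xmm0, xmm0, xmm1
    const lanes ones = model_vpcmpeqd(zero, zero);   // vpcmpeqd xmm1, xmm1, xmm1
    return model_vptest_cf(mask, ones);              // vptest   xmm0, xmm1 ; setb al
}

// What a single `vptest x, x` followed by `sete` would compute (ZF).
constexpr bool direct_simple_ptest(const lanes& x) {
    return (x[0] | x[1] | x[2] | x[3]) == 0;
}
```

Under this model both functions agree for every input, since the mask is all-ones exactly when every lane of `x` is zero; the three extra instructions only serve to convert the ZF-style test into a CF-style one.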

Reference: [the same issue in GCC Bugzilla](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118416) (identified there as a duplicate of another missed optimization).
</pre>