<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/62776">62776</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            missed-optimization: reduction builtins do not generate movmskps and testps instructions
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          VinInn
      </td>
    </tr>
</table>

<pre>
For this code:
int sum() {
   int ret = 0;
   for (int i = 0; i < 8; ++i) ret += (0 == v[i]);
   return ret;
}

clang generates a movmskps instruction, whereas for the equivalent
int sumV() {
   return __builtin_reduce_add(v == 0);
}

it does not.
(The same applies to the "or" and "and" reductions.)

See https://godbolt.org/z/4n9jT1ed5 for more details.
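
A minimal self-contained sketch of the two functions, for readers who cannot open the link. The declaration of v is not shown in this report, so the float8 typedef, the mask8 typedef, the initial values, and the HAVE_REDUCE_ADD macro below are all assumptions made for illustration. The fallback loop is only there so the sketch also builds on compilers without __builtin_reduce_add.

```c
/* Assumed declarations: the report does not show how v is defined. */
typedef float float8 __attribute__((vector_size(32)));
typedef int   mask8  __attribute__((vector_size(32)));

/* Hypothetical test data with four zero lanes. */
float8 v = {0.0f, 1.0f, 0.0f, 2.0f, 0.0f, 3.0f, 4.0f, 0.0f};

/* Detect the clang reduction builtin, if the compiler can report it. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_reduce_add)
#    define HAVE_REDUCE_ADD 1
#  endif
#endif

/* Scalar form: each matching lane contributes +1. */
int sum(void) {
    int ret = 0;
    for (int i = 0; i < 8; ++i) ret += (0 == v[i]);
    return ret;
}

/* Vector form: a vector comparison yields -1 (all bits set) per matching lane. */
int sumV(void) {
    mask8 m = (v == 0.0f);
#ifdef HAVE_REDUCE_ADD
    return __builtin_reduce_add(m);
#else
    int ret = 0;
    for (int i = 0; i < 8; ++i) ret += m[i];
    return ret;
#endif
}
```

Under these assumptions the two results differ in sign, because with vector extensions a true lane compares to -1 rather than 1; the sign bits of that mask are what movmskps would extract.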
</pre>