<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/93807>93807</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Floating-point reduction expansion
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:WebAssembly,
            backend:X86,
            vectorization,
            floating-point
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          sparker-arm
      </td>
    </tr>
</table>

<pre>
In my WebAssembly runtime of choice, I'd like to be able to implement some FP reductions using pairwise operations. For AArch64 codegen I'm currently looking at the Neon instruction `faddp`, which I believe is the same as `haddp` from SSE.

My problem is that, once lowered to WebAssembly, the shuffle patterns do not represent pairwise operations, so I cannot pattern match them to reconstruct the pairwise reduction at runtime.

From a very brief look at X86ISelLowering.cpp, `isHorizontalBinOp` looks for the shuffle patterns I would expect: ones that match even and odd lanes. But, AFAICT, the shuffle patterns produced by `getShuffleReduction` are not the ones we're looking for, except in the case of a two-element vector.
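
To illustrate the mismatch, here is a rough standalone sketch (not the LLVM code itself). `halvingMasks` is a hypothetical helper name, and the "move the upper half of the live lanes onto the lower half" shape is my assumption about what the current expansion emits. Only the final step, or a two-element vector, pairs lane 0 with lane 1:

```cpp
#include <vector>

// Hypothetical helper: builds one shuffle mask per reduction step, assuming
// each step moves the upper half of the live lanes onto the lower half.
// Lanes that are not read are marked undef (-1). VF is assumed to be a
// power of two.
std::vector<std::vector<int>> halvingMasks(unsigned VF) {
  std::vector<std::vector<int>> Masks;
  for (unsigned Width = VF; Width != 1; Width >>= 1) {
    std::vector<int> Mask(VF, -1);
    for (unsigned J = 0; J != Width / 2; ++J)
      Mask[J] = static_cast<int>(Width / 2 + J);
    Masks.push_back(Mask);
  }
  return Masks;
}
```

Under that assumption, the first mask for an eight-element vector pairs lane 0 with lane 4, lane 1 with lane 5 and so on, which is not an even/odd pairing.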

I haven't verified my change to LoopUtils.cpp, but my shuffle generator looks like this:

```
SmallVector<int, 32> ShuffleMask(VF);
for (unsigned stride = 1; stride < VF; stride <<= 1) {
  // Initialise the mask with undef.
  std::fill(ShuffleMask.begin(), ShuffleMask.end(), -1);
  for (unsigned i = 0; i < VF; i += stride << 1) {
    // E.g. on the first iteration (stride == 1) this sets:
    //   ShuffleMask[0] = 1;
    //   ShuffleMask[2] = 3;
    //   ShuffleMask[4] = 5;
    //   ShuffleMask[6] = 7;
    ShuffleMask[i] = i + stride;
  }
  ...
```
And it causes a whole bunch of test failures... So the easier option for me would be to make this a WebAssembly-specific expansion, but it seems that it could be useful for other backends with horizontal pairwise operations.
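
For reference, the mask generation above can be exercised outside of LLVM with a small scalar model. This is only a sketch: `pairwiseMasks` and `reduceWithMasks` are hypothetical names, VF is assumed to be a power of two, and undef lanes are modelled as contributing zero, which is a simplification of what a real shuffle would produce.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring the loop above: for each stride, lane i
// reads lane i + stride and every other lane is undef (-1).
std::vector<std::vector<int>> pairwiseMasks(unsigned VF) {
  std::vector<std::vector<int>> Masks;
  for (unsigned Stride = 1; Stride < VF; Stride <<= 1) {
    std::vector<int> Mask(VF, -1);
    for (unsigned I = 0; I < VF; I += Stride << 1)
      Mask[I] = static_cast<int>(I + Stride);
    Masks.push_back(Mask);
  }
  return Masks;
}

// Scalar model of "shuffle, then fadd" applied once per mask. Undef lanes
// are treated as 0.0f here (an assumption of this model). The reduced
// value ends up in lane 0.
float reduceWithMasks(std::vector<float> V,
                      const std::vector<std::vector<int>> &Masks) {
  for (const auto &Mask : Masks) {
    std::vector<float> Shuffled(V.size(), 0.0f);
    for (std::size_t I = 0; I < V.size(); ++I)
      if (Mask[I] >= 0)
        Shuffled[I] = V[static_cast<std::size_t>(Mask[I])];
    for (std::size_t I = 0; I < V.size(); ++I)
      V[I] += Shuffled[I];
  }
  return V[0];
}
```

With these masks the first step adds adjacent even/odd lanes, which is the shape I would hope to match back to a pairwise add.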

@RKSimon I'd really like some feedback on whether this makes sense and is worthwhile.
</pre>