<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/141768">141768</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [LV] Maximum VF does not consider scaled reductions
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          preames
      </td>
    </tr>
</table>

<pre>
Reproducer: https://godbolt.org/z/4xf7c8GMM

It looks like the vectorizer has not yet been updated to account for scaled reductions (a.k.a. multiply-accumulate with extended operands) in its VF selection logic.  In this case, if my tracing through the debug output is correct, we treat i32 as the widest type in the loop and select the maximum VF to cost based on that.  The resulting loop runs at 1/4 of the width it should.  That is still more profitable than not using the zvqdotq (scaled reduction) lowering, but it isn't ideal.

int doti32_i8_sext(char *a, char *b, int N) {
  int sum = 0;
  for (int i = 0; i < N; i++) {
    int a32 = a[i];
    int b32 = b[i];
    sum += a32 * b32;
  }
  return sum;
}

// -O3 -x c++ -march=rv64gcv_zvqdotq0p0 -menable-experimental-extensions
.LBB0_5:
        vsetvli a5, zero, e8, mf2, ta, ma
        vle8.v  v9, (a3)
        vle8.v  v10, (a4)
        add     a4, a4, t0
        vsetvli a5, zero, e32, mf2, ta, ma
        vqdotu.vv       v8, v10, v9
        add     a3, a3, t0
        bne     a4, a7, .LBB0_5
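
For anyone unfamiliar with the term, here is a rough scalar model of what a scaled reduction computes. It is only a sketch under assumptions: the names dot_scalar, dot_scaled, SCALE and MAX_LANES are hypothetical, signed char stands in for the reproducer's char, and none of this reflects actual LoopVectorize code. Each of the `lanes` i32 accumulators absorbs SCALE = 4 adjacent i8 products per iteration, so one iteration consumes lanes * 4 input elements. As I understand it, that factor of 4 is what goes missing when the maximum VF is derived from the i32 type alone.

```c
/* Hypothetical names; SCALE = 4 models four i8 products folded into one i32 lane. */
enum { SCALE = 4, MAX_LANES = 64 };

/* Plain scalar reference: widen each i8 to i32, multiply, accumulate. */
int dot_scalar(const signed char *a, const signed char *b, int n) {
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += (int)a[i] * (int)b[i];
  return sum;
}

/* Scalar model of a scaled reduction with `lanes` i32 accumulators. */
int dot_scaled(const signed char *a, const signed char *b, int n, int lanes) {
  int acc[MAX_LANES] = {0};
  if (lanes < 1) lanes = 1;                 /* clamp to a usable range */
  if (lanes > MAX_LANES) lanes = MAX_LANES;

  /* Input elements consumed per iteration: 4x the accumulator lane count. */
  const int vf = lanes * SCALE;
  int i = 0;
  for (; i + vf <= n; i += vf)
    for (int l = 0; l < lanes; l++)
      for (int k = 0; k < SCALE; k++) {
        const int idx = i + l * SCALE + k;
        acc[l] += (int)a[idx] * (int)b[idx];
      }

  /* Final horizontal reduction across lanes, then a scalar tail loop. */
  int sum = 0;
  for (int l = 0; l < lanes; l++)
    sum += acc[l];
  for (; i < n; i++)
    sum += (int)a[i] * (int)b[i];
  return sum;
}
```

Both functions return the same value for any `lanes`; only the number of input elements handled per iteration changes, which is the quantity the VF selection would ideally be costing.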
</pre>