[PATCH] D71919: [LoopVectorize] Disable single stride access predicates when gather loads are available.

Ayal Zaks via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 9 08:40:21 PST 2020


Ayal added a comment.

> The LoopVectorizer/LAA has the ability to add runtime checks for memory accesses that look like they may be single stride accesses, in an attempt to still run vectorized code. This can happen in a boring matrix multiply kernel, for example:



  for (int i = 0; i < n; i++) {
    for (int j = 0; j < m; j++) {
      int sum = 0;
      for (int k = 0; k < l; k++)
        sum += A[i*l + k] * B[k*m + j];
      C[i*m + j] = sum;
    }
  }

Note that a (more boring?) matrix multiply kernel in which B is a square matrix, i.e., where the stride m equals the trip count l, will not be specialized for m=1. The general case above, however, may well multiply matrix A by a single-column matrix B, whose stride m is 1.
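To make the specialization concrete, here is a rough source-level picture of what versioning the inner k loop on a unit stride amounts to. It is only a sketch: the real transformation happens on IR inside LoopVectorize/LAA, and the helper name dot_column_versioned and its parameters are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: dot product of one row of A (l elements) with one
   column of B, where consecutive column elements are m ints apart. */
int dot_column_versioned(const int *a_row, const int *b_col, int l, int m)
{
    assert(l >= 0 && m >= 1);
    int sum = 0;

    if (m == 1) {
        /* Runtime check passed: B is read with unit stride, so this copy
           of the loop only touches consecutive elements. */
        for (int k = 0; k < l; k++)
            sum += a_row[k] * b_col[k];
    } else {
        /* Fallback copy: B is read with stride m. */
        for (int k = 0; k < l; k++)
            sum += a_row[k] * b_col[(size_t)k * (size_t)m];
    }
    return sum;
}
```

The m == 1 branch is the one that is actually taken when B is a single-column matrix.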

Another possible way to prevent such undesired specialization may be to use a __builtin_expect/llvm.expect(m>1, 1).
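For reference, a minimal sketch of where such a hint could sit in the source. This assumes GCC or Clang (__builtin_expect is a compiler builtin, which Clang lowers to llvm.expect); the function name matmul_expect_wide is hypothetical, and whether LoopVectorize/LAA would actually take the hint into account when deciding on stride versioning is exactly the open question here, not something this sketch demonstrates.

```c
#include <assert.h>

/* Hypothetical wrapper around the kernel quoted above. */
void matmul_expect_wide(const int *A, const int *B, int *C,
                        int n, int m, int l)
{
    assert(n >= 0 && m >= 1 && l >= 0);

    if (__builtin_expect(m > 1, 1)) {
        /* Expected case: B has more than one column, so the accesses
           B[k*m + j] in the k loop are strided by m > 1. */
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                int sum = 0;
                for (int k = 0; k < l; k++)
                    sum += A[i*l + k] * B[k*m + j];
                C[i*m + j] = sum;
            }
        }
    } else {
        /* Unexpected case m == 1: B is a single column, read with unit
           stride, and C is a single column as well. */
        for (int i = 0; i < n; i++) {
            int sum = 0;
            for (int k = 0; k < l; k++)
                sum += A[i*l + k] * B[k];
            C[i] = sum;
        }
    }
}
```

Both branches compute the same product; the branch only records which case the programmer expects to be hot.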


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71919/new/

https://reviews.llvm.org/D71919




