[llvm-dev] LLVM Matrix Multiplication Loop Vectorizer

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 28 05:20:27 PDT 2017


On 06/27/2017 11:36 PM, Serge Preis wrote:
> Hello,
> The message basically means that llvm failed to recognize C[i][j] as 
> valid reduction. In order to make C[i][j] valid reduction is to be 
> privatized into some scalar value for the innermost loop. In you case 
> aliasing analysis fails to prove that C[i][j] never aliases with A and 
> B and this seems correct. So you need something like this to make loop 
> vectorizable:
> be explicit:
> #include <stdio.h>
> #define N 1000
> // This function multiplies A[][] and B[][], and stores
> // the result in C[][]
> void multiply(int A[][N], int B[][N], int C[][N])
> {
>     int i, j, k;
>     for (i = 0; i < N; i++)
>     {
>         for (j = 0; j < N; j++)
>         {
> int res = 0;
>             for (k = 0; k < N; k++)
>                 res += A[i][k]*B[k][j];
> C[i][j] = res;
>         }
>     }
> }
> or just add restrict to arguments:
> // This function multiplies A[][] and B[][], and stores
> // the result in C[][]
> void multiply(int A[restrict][N], int B[restrict][N], int C[restrict][N])
> {
>     int i, j, k;
>     for (i = 0; i < N; i++)
>     {
>         for (j = 0; j < N; j++)
>         {
>             for (k = 0; k < N; k++)
>                 C[i][j] += A[i][k]*B[k][j];
>         }
>     }
> }

I'd advised Hameeza to file a bug report for this. We should be able to 
vectorize this without the restrict by emitting runtime checks.

  -Hal

> On the practical side of things though the following loops reordering 
> should provide much better performance when vectorizsed because in 
> your case you have gather operation (strided load) from B + costly 
> reduce operation in j-loop.
> #include <stdio.h>
> #define N 1000
> // This function multiplies A[][] and B[][], and stores
> // the result in C[][]
> void multiply(int A[][N], int B[][N], int C[][N])
> {
>     int i, j, k;
>     for (i = 0; i < N; i++)
>     {
>         for (j = 0; j < N; j++)
> C[i][j] = 0;
>         for (k = 0; k < N; k++) {
>         for (j = 0; j < N; j++)
>         {
>                C[i][j] += A[i][k]*B[k][j];
>         }
> }
>     }
> }
> 28.06.2017, 02:54, "hameeza ahmed via llvm-dev" <llvm-dev at lists.llvm.org>:
>> Hello,
>> i am trying to vectorize a simple matrix multiplication in llvm;
>> here is my code;
>> #include <stdio.h>
>> #define N 1000
>> // This function multiplies A[][] and B[][], and stores
>> // the result in C[][]
>> void multiply(int A[][N], int B[][N], int C[][N])
>> {
>>     int i, j, k;
>>     for (i = 0; i < N; i++)
>>     {
>>         for (j = 0; j < N; j++)
>>         {
>>             C[i][j] = 0;
>>             for (k = 0; k < N; k++)
>>                 C[i][j] += A[i][k]*B[k][j];
>>         }
>>     }
>> }
>> here are the commands;
>> clang  -S -emit-llvm mat.c -march=knl -O3 -mllvm -disable-llvm-optzns 
>> -o mat.ll
>> opt  -S -O3  mat.ll -o mat_o3.ll
>> llc -x86-asm-syntax=intel mat_o3.ll -o mat_intel.s
>> with this command i got the below error
>> opt  -S -O3 -force-vector-width=16 mat.ll -o mat_o3.ll
>> remark: <unknown>:0:0: loop not vectorized: value that could not be 
>> identified as reduction is used outside the loop
>> it is unable to vectorize the matrix multiplication and in .ll and .s 
>> files i see the scalar instructions.
>> Why is that so? What is my mistake?? Kindly correct me.
>> Looking forward to your reply
>> Thank You
>> ,
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170628/d0eee687/attachment.html>


More information about the llvm-dev mailing list