[llvm-dev] which pass can do following optimization? gvn-sink?

Fangqing Du via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 30 11:24:17 PDT 2021


Thank you Michael!
This info is very useful!

Fangqing

On Fri, Jul 30, 2021 at 11:05 AM Michael Kruse <llvmdev at meinersbur.de>
wrote:

> This kind of optimization is done by the LICM pass. Look for
> promoteLoopAccessesToScalars in LICM.cpp. However, it requires the
> loop code to be executed unconditionally (or
> isSafeToExecuteUnconditionally to hold). See the justification in the
> comment for promoteLoopAccessesToScalars.
>
> Michael
>
> On Fri, Jul 30, 2021 at 12:23 PM Fangqing Du via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >
> > Dear all,
> >
> > Imagine we have the following code:
> >
> >   #define ny 10
> >   #define Batch_Size 10
> >
> >   typedef float data_t;
> >
> >   void foo(data_t out[ny][Batch_Size], data_t max[Batch_Size]);
> >
> >   void Softmax_Activation(data_t l_Z2[ny][Batch_Size],
> >                           data_t out[ny][Batch_Size]) {
> >
> >     data_t max[Batch_Size];
> >
> >   SA_MAX2:
> >     for (int i = 0; i < Batch_Size; i++) {
> >       max[i] = 0;
> >     SA_MAX1:
> >       for (int j = 0; j < ny; j++) {
> >         if (l_Z2[j][i] > max[i])
> >           max[i] = l_Z2[j][i];
> >       }
> >     }
> >     foo(out, max);
> >   }
> >
> > We can see that the address of 'max[i]' is invariant in loop 'SA_MAX1',
> > so I want to know which pass can perform the following
> > transformation/optimization:
> >
> >   #define ny 10
> >   #define Batch_Size 10
> >
> >   typedef float data_t;
> >
> >   void foo(data_t out[ny][Batch_Size], data_t max[Batch_Size]);
> >
> >   void Softmax_Activation(data_t l_Z2[ny][Batch_Size],
> >                           data_t out[ny][Batch_Size]) {
> >
> >     data_t max[Batch_Size];
> >
> >   SA_MAX2:
> >     for (int i = 0; i < Batch_Size; i++) {
> >       data_t Max = 0;
> >     SA_MAX1:
> >       for (int j = 0; j < ny; j++) {
> >         if (l_Z2[j][i] > Max)
> >           Max = l_Z2[j][i];
> >       }
> >       max[i] = Max;
> >     }
> >     foo(out, max);
> >   }
> >
> > This uses a local scalar 'Max' in place of the original 'max[i]' and
> > sinks the original write out of the loop 'SA_MAX1'.
> >
> > I did some experiments with godbolt, and it looks like this
> > optimization is not currently performed.
> > https://godbolt.org/z/9PK3hYvPs
> >
> > Do you know which pass can do this? Or is it not necessary for CPUs?
> >
> > Thanks,
> > Fangqing
> > Xilinx Inc.
>
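To make the transformation discussed in this thread concrete, here is a minimal, self-contained C sketch of the two forms side by side. It is a hand-written illustration, not output from LICM, and the names column_max_in_memory, column_max_promoted, NY and BATCH_SIZE are hypothetical (the original code uses ny, Batch_Size and Softmax_Activation). The point it tries to show is the safety concern behind the condition Michael mentions: the promoted form stores to max[i] after the inner loop even when the 'if' never fires. In this particular source that store is harmless because 'max[i] = 0' already writes the same location unconditionally before the inner loop. Whether LICM can prove this for a given input is not something the sketch claims.

```c
#define NY 10
#define BATCH_SIZE 10

typedef float data_t;

/* Original form: max[i] is loaded from memory for every comparison and
   conditionally stored inside the inner loop. */
void column_max_in_memory(data_t in[NY][BATCH_SIZE], data_t max[BATCH_SIZE]) {
  for (int i = 0; i < BATCH_SIZE; i++) {
    max[i] = 0; /* unconditional store before the inner loop */
    for (int j = 0; j < NY; j++) {
      if (in[j][i] > max[i])
        max[i] = in[j][i]; /* conditional store inside the loop */
    }
  }
}

/* Promoted form: the running maximum lives in a local scalar, and a
   single store to max[i] is sunk below the inner loop. */
void column_max_promoted(data_t in[NY][BATCH_SIZE], data_t max[BATCH_SIZE]) {
  for (int i = 0; i < BATCH_SIZE; i++) {
    data_t m = 0;
    for (int j = 0; j < NY; j++) {
      if (in[j][i] > m)
        m = in[j][i];
    }
    max[i] = m; /* executes even if the 'if' above was never taken */
  }
}
```

Both functions write the same values to max for any input, since the only difference is where the running maximum is kept between iterations of the inner loop.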