<div class="socmaildefaultfont" dir="ltr" style="font-family:Arial, Helvetica, sans-serif;font-size:10pt" ><div dir="ltr" ><font face="AppleSystemUIFont" size="3" >LoopUnrollAndJamPass is currently a loop pass. It is added to an LPM (loop pass manager) that contains only this pass:</font><br><font face="AppleSystemUIFont" size="3" >`OptimizePM.addPass(createFunctionToLoopPassAdaptor(LoopUnrollAndJamPass(Level)));`</font><br><font face="AppleSystemUIFont" size="3" >Note that an LPM traverses loops from innermost to outermost.</font><br><br><font face="AppleSystemUIFont" size="3" >The current implementation of LoopUnrollAndJamPass only handles a loop that has exactly one inner loop (L->getSubLoops().size() == 1). </font><br><font face="AppleSystemUIFont" size="3" >Consider the example below:</font><br><font face="AppleSystemUIFont" size="3" >Before unroll and jam:</font><br><font face="AppleSystemUIFont" size="3" >```</font><br><font face="AppleSystemUIFont" size="3" >for i</font><br><font face="AppleSystemUIFont" size="3" >&nbsp;&nbsp;for j</font><br><font face="AppleSystemUIFont" size="3" >&nbsp;&nbsp;&nbsp;&nbsp;for k</font><br><font face="AppleSystemUIFont" size="3" >&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;A[i][j][k] = 0;</font><br><font face="AppleSystemUIFont" size="3" >```</font><br><font face="AppleSystemUIFont" size="3" >After unrolling and jamming loop-j by a factor of 2:</font><br><font face="AppleSystemUIFont" size="3" >```</font><br><font face="AppleSystemUIFont" size="3" >for i</font><br><font face="AppleSystemUIFont" size="3" >&nbsp;&nbsp;for j += 2</font><br><font face="AppleSystemUIFont" size="3" >&nbsp;&nbsp;&nbsp;&nbsp;for k</font><br><font face="AppleSystemUIFont" size="3" >&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;A[i][j][k] = 0;</font><br><font face="AppleSystemUIFont" size="3" >&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;A[i][j+1][k] = 0;</font><br><font face="AppleSystemUIFont" size="3" >&nbsp;&nbsp;for j'=j</font><br><font face="AppleSystemUIFont" size="3" >&nbsp;&nbsp;&nbsp;&nbsp;for k</font><br><font face="AppleSystemUIFont" size="3" >&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;A[i][j'][k] = 0;</font><br><font face="AppleSystemUIFont" size="3" >```</font><br><font face="AppleSystemUIFont" size="3" >Notice that LoopUnrollAndJamPass 
can no longer unroll and jam loop-i on its next invocation, since loop-i now contains two inner loops (the unrolled loop-j and its remainder loop).</font><br><font face="AppleSystemUIFont" size="3" >If LoopUnrollAndJamPass were a function pass, it could control the order in which loops are considered. By applying the transformation from outer to inner, both loop-i and loop-j can be unrolled and jammed. </font><br><br><font face="AppleSystemUIFont" size="3" >In conclusion, I propose changing LoopUnrollAndJamPass from a loop pass to a function pass, for the following reasons:</font><br><font face="AppleSystemUIFont" size="3" >1. There is no obvious reason why LoopUnrollAndJamPass needs to be a loop pass.</font><br><font face="AppleSystemUIFont" size="3" >2. More loops can be transformed by traversing in an outer-to-inner order.</font><br><font face="AppleSystemUIFont" size="3" >3. Fewer remainder loops are needed if we consider the whole loop nest together.</font><br><font face="AppleSystemUIFont" size="3" >4. A better cost model can be built by considering the whole loop nest together.</font><br><br><font face="AppleSystemUIFont" size="3" >Regards,</font><br><font face="AppleSystemUIFont" size="3" >Whitney Tsang</font></div></div><BR>