[PATCH] D24451: [LoopUnroller] Replace UnrollingPreferences::Force with ForceMaxCount + SystemZ getUnrollingPreferences().

Jonas Paulsson via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 14 09:22:15 PDT 2016


jonpa added a comment.

Chandler,

On "SPEC", considering only final loops that ended up as 8 machine instructions or less,
with *my patch without forced unrolling*, I get that
`

     48  loops were unrolled, but ended up <= 8 MIs.
  10353  loops were not unrolled, because:
    865  pragma disabled (from loop vectorizer -- scalar loop)
    342  UnrolLLoop() : Any of the first five "return false" (!Preheader or !LatchBlock or !L->isSafeToClone() or (!BI || BI->isUnconditional()))
   3174  Loops containing calls - disabled purposefully for SystemZ
   5969  Loops where UnrollLoop() return false if UnrollRuntimeLoopRemainder() returns false, whereof:
         2300  Any of the first three "return false" (!L->getExitingBlock() or !L->isLoopSimplifyForm() or !Exit)
         3669  4th "return false",   // Only unroll loops with a computable trip count, and the trip count needs to be an int value (allowing a pointer type is a TODO item)
  `

Same, but the complete patch including forced unrolling:

   606  loops were unrolled, but ended up <= 8 MIs.  (I happen to know that they are practically all >6, though ;-)
  4434  loops were not unrolled, because:
   865  pragma disabled (from loop vectorizer -- scalar loop)
   342  UnrolLLoop() : Any of the first five "return false" (!Preheader or !LatchBlock or !L->isSafeToClone() or (!BI || BI->isUnconditional()))
  3174  Loops containing calls - disabled purposefully for SystemZ
    54  Loops where UnrollLoop() return false if UnrollRuntimeLoopRemainder() returns false, whereof:
         24  Any of the first three "return false" (!L->getExitingBlock() or !L->isLoopSimplifyForm() or !Exit)
         30  4th "return false",   // Only unroll loops with a computable trip count, and the trip count needs to be an int value (allowing a pointer type is a TODO item)

It is quite clear that for runtime unroller cannot currently handle all loops. The reasons seem to realate to the LoopInfo and SCEV classes.
It is also clear that forced unrolling basically handles nearly everything left.

Evegeny,

I see your point that there is already a loop size estimate available. This however is not an estimate only of instruction count, but also weighs in
execution factors.

My background here is that first of all I am interested in the resulting machine instruction count only, so e.g. a div should only count as one. Therefore I am doing

if (getUserCost(I) != TargetTransformInfo::TargetCostConstants::TCC_Free)
​        MICountEstimate++;

This patch as it is now gets rid of basically all loops (except those with calls and pragma unroll disable) smaller than 6 instructions, by using an MICountEstimate
of 12. I found that an estimate of 12 will include 99% of loops <= 8 MIs, and that the resulting ForceMaxCount of ceil(12 / MICountEstimate), limited by 3, got
rid of practically 99,9 of loops <= 6 MIs. This is exactly what was the goal with the patch, and this is a very good result that would be a shame to change.

Having worked with this patch for a while, it has become clear that the *exact number of iterations* produced needs to be controlled. The resulting loop should have no more than 12 stores, and it seems bad to have more than 3 exit branches w/ forced unrolling.

This patch did not work very well before, so it was the forced unrolling in exactly this setting that made it all work (both performance wise, and also achieving the objective of getting rid of all small loops). Therefore I would be very reluctant to change any functionality.

So if you really object to this, I migh need to still do my own MICountEstimate, compute the MaxForceCount, and then somehow access the LoopSize estimate of the LoopUnrollPass, and in this indirect way give a new variable ForceThreshold a value that give the same results.


https://reviews.llvm.org/D24451





More information about the llvm-commits mailing list