[PATCH] D11605: [LoopUnswitch] Add block frequency analysis to recognize hot/cold regions

Fri Jul 31 16:25:07 PDT 2015

reames requested changes to this revision.
This revision now requires changes to proceed.

================
Comment at: lib/Transforms/Scalar/LoopUnswitch.cpp:76
@@ +75,3 @@
+static cl::opt<bool>
+LoopUnswitchWithBlockFrequency("loop-unswitch-with-block-frequency", cl::init(false), cl::Hidden,
+    cl::desc("Enable the use of the block frequency analysis to access PGO "
----------------
What are your plans for testing this and getting it enabled by default?

================
Comment at: lib/Transforms/Scalar/LoopUnswitch.cpp:82
@@ +81,3 @@
+// The default frequency 20 was chosen based on LoopVectorize's choice of cold frequency.
+static cl::opt<unsigned>
+ColdFrequency("loop-unswitch-cold-block-frequency", cl::init(20), cl::Hidden,
----------------
Have you explored this parameter at all?

My guess would be that disabling loop unswitching should only be done for *really* cold loops.  I have no evidence, but a cool loop in a really hot function seems like something we should unswitch.  

================
Comment at: lib/Transforms/Scalar/LoopUnswitch.cpp:433
@@ +432,3 @@
+  // as cold.
+  const BranchProbability ColdProb(ColdFrequency, 100);
+  ColdEntryFreq = BlockFrequency(BFI->getEntryFreq()) * ColdProb;
----------------
Minor comment clarification: "function entry baseline frequency".  "Loops with headers below this frequency"

================
Comment at: lib/Transforms/Scalar/LoopUnswitch.cpp:496
@@ +495,3 @@
+  // is less than ColdFrequency% of the function entry baseline frequency.
+  BlockFrequency LoopEntryFreq = BFI->getBlockFreq(loopPreheader);
+  if (LoopUnswitchWithBlockFrequency && LoopEntryFreq < ColdEntryFreq)
----------------
chenli wrote:
> I chose to use loop preheader's frequency for coldness comparison. This will ignore the size of the loop because it does not count backedges or iteration numbers. My reasoning is that if the loop was only entered once in the pass 1000 executions, we should be confident that it will never be entered again. Therefore, no matter how many  times the loop iterates, it shouldn't affect our decision. But I could see the other way around. I am open to change it if anyone prefers the other way.
I feel pretty strongly that making decisions based on the number of times the loop executes rather than the number of times the header executes is likely to produce negative results.  

Consider:
while(true) {
  if (loop invariant) {
  }
  if (loop invariant2) {
  }
  if (loop invariant3) {
  }
  do work();
}

Your change would completely disable unswitching in this loop which seems non-ideal.  

================
Comment at: lib/Transforms/Scalar/LoopUnswitch.cpp:498
@@ +497,3 @@
+  if (LoopUnswitchWithBlockFrequency && LoopEntryFreq < ColdEntryFreq)
+    return false;
+
----------------
Given there's already handling for OptForSize within UnswitchIfProfitable, I'd like to see the handling combined.  A cleanup change to move the OptForSize check here - I think that's NFC right? - and then using that in this patch would help.

================
Comment at: test/Transforms/LoopUnswitch/cold-loop.ll:21
@@ +20,3 @@
+continue:
+  br i1 %cond3, label %do_something, label %do_something_else  ; non-trivial condition  
+
----------------
This is perhaps a bad example.  In this particular case, we could split the loop into two subloops without code growth.  Maybe tweak your example so it's more obvious why we want to avoid unswitching this?

http://reviews.llvm.org/D11605