[PATCH] D136233: [SimpleLoopUnswitch] Inject loop-invariant conditions and unswitch them when it's profitable

Thu Dec 8 21:30:29 PST 2022

mkazantsev added inline comments.

================
Comment at: llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:2963-2966
+  if (auto *MDNode = TI->getMetadata(LLVMContext::MD_make_implicit))
+    NewTerm->setMetadata(LLVMContext::MD_make_implicit, MDNode);
+  if (auto *MDNode = TI->getMetadata(LLVMContext::MD_prof))
+    NewTerm->setMetadata(LLVMContext::MD_prof, MDNode);
----------------
apilipenko wrote:
> The injected condition doesn't necessarily have the same profile as the original.
> 
> I also don't think that we should preserve MD_make_implicit.
> The injected condition doesn't necessarily have the same profile as the original.

If the number of iterations in the loop is the same, then it does. Because this condition is invariant, the frequency ratio shows how often it is true or false.

If the number of iterations in this loop is different from call to call - well, formally, it might not be the same after unswitching. However, it also means that the profile we report here is something collected from cumulative multiple different runs, and is misleading by itself. It could be 0:1 in one run, 1000:0 in another run, and 1000:1 on average. I still think that in this situation we can preserve it, just to denote which loop is more frequent, even if exact numbers won't hold. Having imprecise, but conceptually "this is more frequent than that" numbers is better than having no numbers. Does that make sense?

> I also don't think that we should preserve MD_make_implicit.

Why not? I think we can always have this metadata whenever we think that one of branches is super rare. And this unswitching won't change that fact.

================
Comment at: llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:3056-3069
+///
+/// We can unswitch by condition "C1 <=u C2". If that is true, then "x <u C1 <=
+/// C2" automatically implies "x <u C2", so we can get rid of one of
+/// loop-variant checks in unswitched loop version.
+/// More generally speaking, we can do the same thing to the following checks:
+///
+///   for (...) {
----------------
apilipenko wrote:
> Since you are not handling zexts in this patch, this part of the comment can be dropped.
Will move to follow-up patch.

================
Comment at: llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:3089-3090
+  DenseMap<Value *, SmallVector<CompareDesc, 4> > CandidatesULT;
+  for (auto *DTN = DT.getNode(Latch); L.contains(DTN->getBlock());
+       DTN = DTN->getIDom()) {
+    ICmpInst::Predicate Pred;
----------------
apilipenko wrote:
> Why do you limit the transform to conditions that dominate the latch?
Because I prove implications, and the implication can only be proven from something that must execute before the implied condition.

================
Comment at: llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:3099-3101
+    // Skip branches that have already been unswithed this way.
+    if (Term->hasMetadata("llvm.invariant.condition.injection.disabled"))
+      continue;
----------------
apilipenko wrote:
> Why do you need to remember this fact and disable unswitching later?
If the invariant condition is false, the initial loop-variant condition cannot be proven true or false. Example:

```
for (int i = 0; i < N; i++) {
  x = arr[i];
  if (x <0 || x >= 128)
    deopt();
  if (x < 0 || x >= len)
    deopt();
  ...
}
```
The transform will insert loop-invariant condition `128 <= len` to get rid of loop-variant check `  if (x < 0 || x >= len)` in one of unswitched copies.

If we have proven that `128 <= len`, then we have proven that `0 <= x < 128 <= len`, and therefore in the unswitched version 2nd check can be removed. But if we haven't proven that, we do not automatically know anything about 2nd condition. Example: len = 100 but x is still less than 99.

It means that in the unswitched copy, we should preserve the initial loop-variant condition. And we need to prevent from the optimization go crazy and infinitely unswitch on it.

================
Comment at: llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:3119-3120
+  for (auto &It : CandidatesULT)
+    Found |= insertCandidatesWithPendingInjections(
+        UnswitchCandidates, L, ICmpInst::ICMP_ULT, It.second, DT);
+  return Found;
----------------
apilipenko wrote:
> Can this be done from the loop above?
Yes, but this loop is big already... I wanted to split up the logic to keep code more or less comprehendable.

================
Comment at: llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:3325-3332
   // If we didn't find any candidates, we're done.
-  if (!collectUnswitchCandidates(UnswitchCandidates, PartialIVInfo,
-                                 PartialIVCondBranch, L, LI, AA, MSSAU))
+  collectUnswitchCandidates(UnswitchCandidates, PartialIVInfo,
+                            PartialIVCondBranch, L, LI, AA, MSSAU);
+  collectUnswitchCandidatesWithInjections(UnswitchCandidates, PartialIVInfo,
+                                          PartialIVCondBranch, L, DT, LI, AA,
+                                          MSSAU);
+  if (UnswitchCandidates.empty())
----------------
apilipenko wrote:
> 
Yup, thanks for pointing out.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136233/new/

https://reviews.llvm.org/D136233