[PATCH] D136233: [SimpleLoopUnswitch] Inject loop-invariant conditions and unswitch them when it's profitable

Mon Jan 9 21:05:41 PST 2023

apilipenko added inline comments.

================
Comment at: llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:2963-2966
+  if (auto *MDNode = TI->getMetadata(LLVMContext::MD_make_implicit))
+    NewTerm->setMetadata(LLVMContext::MD_make_implicit, MDNode);
+  if (auto *MDNode = TI->getMetadata(LLVMContext::MD_prof))
+    NewTerm->setMetadata(LLVMContext::MD_prof, MDNode);
----------------
mkazantsev wrote:
> apilipenko wrote:
> > The injected condition doesn't necessarily have the same profile as the original.
> > 
> > I also don't think that we should preserve MD_make_implicit.
> > The injected condition doesn't necessarily have the same profile as the original.
> 
> If the number of iterations in the loop is the same, then it does. Because this condition is invariant, the frequency ratio shows how often it is true or false.
> 
> If the number of iterations in this loop is different from call to call - well, formally, it might not be the same after unswitching. However, it also means that the profile we report here is something collected from cumulative multiple different runs, and is misleading by itself. It could be 0:1 in one run, 1000:0 in another run, and 1000:1 on average. I still think that in this situation we can preserve it, just to denote which loop is more frequent, even if exact numbers won't hold. Having imprecise, but conceptually "this is more frequent than that" numbers is better than having no numbers. Does that make sense?
> 
> > I also don't think that we should preserve MD_make_implicit.
> 
> Why not? I think we can always have this metadata whenever we think that one of branches is super rare. And this unswitching won't change that fact.
> I still think that in this situation we can preserve it, just to denote which loop is more frequent, even if exact numbers won't hold. Having imprecise, but conceptually "this is more frequent than that" numbers is better than having no numbers. Does that make sense?

I would prefer to drop the profile in this case. I suggest removing the preservation of branch weights metadata for now. Not having branch weights is always valid. We can return to this discussion later, once we have the main logic of the optimization checked in.

> Why not? I think we can always have this metadata whenever we think that one of branches is super rare. And this unswitching won't change that fact.

The branch needs to be very rare, *and* there should be a way to heal from this optimization. 
https://llvm.org/docs/FaultMaps.html#make-implicit-metadata

This is way implicit null check optimization is not driven by profile only.

================
Comment at: llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:3089-3090
+  DenseMap<Value *, SmallVector<CompareDesc, 4> > CandidatesULT;
+  for (auto *DTN = DT.getNode(Latch); L.contains(DTN->getBlock());
+       DTN = DTN->getIDom()) {
+    ICmpInst::Predicate Pred;
----------------
mkazantsev wrote:
> apilipenko wrote:
> > Why do you limit the transform to conditions that dominate the latch?
> Because I prove implications, and the implication can only be proven from something that must execute before the implied condition.
But what you need in fact is dominance between the two branches, not between the branches and the latch.

BTW, do you need to worry about implicit control flow/must execute property here?

================
Comment at: llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:3099-3101
+    // Skip branches that have already been unswithed this way.
+    if (Term->hasMetadata("llvm.invariant.condition.injection.disabled"))
+      continue;
----------------
mkazantsev wrote:
> apilipenko wrote:
> > Why do you need to remember this fact and disable unswitching later?
> If the invariant condition is false, the initial loop-variant condition cannot be proven true or false. Example:
> 
> ```
> for (int i = 0; i < N; i++) {
>   x = arr[i];
>   if (x <0 || x >= 128)
>     deopt();
>   if (x < 0 || x >= len)
>     deopt();
>   ...
> }
> ```
> The transform will insert loop-invariant condition `128 <= len` to get rid of loop-variant check `  if (x < 0 || x >= len)` in one of unswitched copies.
> 
> If we have proven that `128 <= len`, then we have proven that `0 <= x < 128 <= len`, and therefore in the unswitched version 2nd check can be removed. But if we haven't proven that, we do not automatically know anything about 2nd condition. Example: len = 100 but x is still less than 99.
> 
> It means that in the unswitched copy, we should preserve the initial loop-variant condition. And we need to prevent from the optimization go crazy and infinitely unswitch on it.
The fact that you are relying on metadata for this is a bit concerning. There might be a pass that doesn't know anything about this metadata and drops it, leading to an undesired unswitching.

You can probably check for the invariant condition you are about to insert in the dominating conditions. If there such a condition is known to be false -> don't unswitch. But this won't be foolproof either, as the condition can be rewritten in a way that we can't infer it anymore. 

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136233/new/

https://reviews.llvm.org/D136233