[llvm-dev] Making loop guards part of canonical loop structure

Tue May 28 06:46:27 PDT 2019

Hi all,

TL;DR: Should the loop guard branch be part of the canonical loop structure?

Background:
-----------
While working on the next set of improvements to loop fusion, we discovered that
loop rotation will put a guard around a loop if it cannot prove the loop will
execute at least once, as seen in the following (simplified) example:

entry:
  br i1 %cmp1, label %for.body.lr.ph, label %for.cond.cleanup

for.body.lr.ph:                                   ; preds = %entry
  br label %for.body

for.body:                                         ; preds = %for.body.lr.ph, %for.inc
  br label %for.inc

for.inc:                                          ; preds = %for.body
  br i1 %cmp, label %for.body, label %for.cond.for.cond.cleanup_crit_edge

for.cond.for.cond.cleanup_crit_edge:              ; preds = %for.inc
  br label %for.cond.cleanup

for.cond.cleanup:                                 ; preds = %for.cond.for.cond.cleanup_crit_edge, %entry
  br label %for.end

for.end:                                          ; preds = %for.cond.cleanup
  ret void

However, for loops that can be proven to execute at least once, there is no
guard.

This creates complications for several loop optimizations because they need to
handle both guarded loops and non-guarded loops. For example, the current loop
fusion pass needs to check whether two loops are control flow equivalent before
fusing them (i.e., if the first loop executes, the second loop is guaranteed to
execute). This is currently done by checking that the preheader of the first
loop dominates the preheader of the second loop, and the preheader of the second
loop post-dominates the preheader of the first loop. When one (or both) of the
loops have a guard, then this check no longer works. If the loop guard was part
of the canonical form, then this check could be done using the loop guard,
instead of the preheader.

Similarly, when checking for perfect loop nests, a loop guard around the inner
loop will make the nest appear imperfect. However, if the guard is considered to
be part of the inner loop, it is easier to prove a perfect nest.

Both of these examples have alternative ways to deal with the loop guards, if
they are present. However, I would like to have the discussion regarding making
guards part of the canonical loop structure. If people are supportive of this,
we can start doing the work to implement the guards and start modify loop
optimizations to deal with them appropriately. However, if this is not the
direction we want to go, we can deal with the guards in the specific
optimizations where they are relevant.

Proposed Solution:
------------------
Ensure that all loops in canonical form have a guard branch that dominates the
preheader. If such a guard branch does not exist, a trivial guard will be
created.

I have not looked at the implementation of this
extensively yet because I wanted feedback on this direction first. However, my
initial thought is to modify LoopSimplify to add the concept of a loop guard,
and provide similar guarantees that it currently provides for preheaders and exit
blocks. Specifically, if a guard block is not found, one is created and inserted
immediately before the loop preheader to guarantee a structure similar to the
one in the example above. Note that as with the preheader and exit blocks, it is
possible that subsequent passes (e.g., SimplifyCFG) may remove these blocks thus
taking it out of canonical form. However, this is an existing situation that
loop optimizations already deal with.

Any comments/feedback are greatly appreciated.

Thanks,

Kit