[llvm-bugs] [Bug 42270] New: Weakness in SCEV/IndVars around exit conditions involving divides of IVs

Thu Jun 13 10:31:25 PDT 2019

https://bugs.llvm.org/show_bug.cgi?id=42270

            Bug ID: 42270
           Summary: Weakness in SCEV/IndVars around exit conditions
                    involving divides of IVs
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Loop Optimizer
          Assignee: unassignedbugs at nondot.org
          Reporter: listmail at philipreames.com
                CC: llvm-bugs at lists.llvm.org

I'm filing this bug to save mental state on an investigation I started, but
ended up deciding wasn't worthwhile to pursue for the moment.  Think of this as
a collection of observations rather than one particular bug.

Consider a test:

; RUN: opt -S -indvars < %s
@A = external global i32
define i32 @simplify_exit_test2(i32 %n) {
entry:
  br label %loop

loop:                                             ; preds = %latch, %entry
  %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
  %iv.next = add i32 %iv, 4
  %fx = udiv i32 %iv, 4
  %c = icmp ult i32 %fx, 1000
  br i1 %c, label %loop, label %exit2

exit:                                             ; preds = %loop
  ret i32 0

exit2:                                            ; preds = %latch
  %fx.lcssa = phi i32 [ %fx, %loop ]
  ret i32 %fx.lcssa
}

In this case, we can see opportunities both for LFTR of the loop exit, and for
rewriting the loop exit in terms of a constant value.  At the moment, we get
neither.  The core reason is that we fail to figure out that %iv.next does not
overflow within the iteration space, and thus fail to form an addrec for %fx. 
Worth noting is that another exit with a computable exit count is enough to
short circuit the analysis problem in most cases; this is a rarer case than it
might first look.  

Note that there are many variations on this test case, each with similar, but
not entirely overlapping sets of issues.  Particularly interesting ones:
replace udiv w/sdiv or ashr or lhsr, mark the udiv exact, step < divider, and
non-constant step or divider.  Amusingly, all of the multiply variants (mul,
shl) appear to mostly work.    

Here's a collection of ideas for enhancements related to this test:
1) Teach IndVars to infer the exact flag on the udiv for a variation with the
add marked nsw/nuw.
2) Extend computeLoopExitsExhaustively to binary search over the possible *max*
taken counts trying to compute the condition for each iteration independently. 
Key observation: there are closed form solutions for many addrecs and we do not
need to brute force from the beginning.  (I prototyped this; it works, but see
next item.)  Warning: This does not give an exact exit count, only a max.
3) We run into a caching problem even with a working max take count
computation.  We end up forming a (zext i32 (0,+,4) to i34) as part of the udiv
distribution check, but because we'd formed the same node before computing the
backedge taken count, we *don't* get the chance to simplify once the critical
nsw/nuw flags have been added.  This is a core problem in SCEV from what I can
tell.
4) Teach IndVars to convert an ashr to a lshr when input is known strictly
positive.  
5) During SCEV construction, canonicalize an ashr on a positive input to a
udiv.  

(1,4,5) are likely simple and would make good starter projects for anyone
interested in this area of code.  (2) is straight-forward, but the only benefit
w/o (3) is to prove nsw/nuw on the add, (3) is hard - very hard.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20190613/27455f55/attachment.html>