<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - Weakness in SCEV/IndVars around exit conditions involving divides of IVs"

   href="https://bugs.llvm.org/show_bug.cgi?id=42270">42270</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>Weakness in SCEV/IndVars around exit conditions involving divides of IVs

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Loop Optimizer

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>listmail@philipreames.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr></table>

      <p>

        <div>

        <pre>I'm filing this bug to save mental state on an investigation I started, but

ended up deciding wasn't worthwhile to pursue for the moment.  Think of this as

a collection of observations rather than one particular bug.

Consider a test:

; RUN: opt -S -indvars < %s

@A = external global i32

define i32 @simplify_exit_test2(i32 %n) {

entry:

  br label %loop

loop:                                             ; preds = %latch, %entry

  %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]

  %iv.next = add i32 %iv, 4

  %fx = udiv i32 %iv, 4

  %c = icmp ult i32 %fx, 1000

  br i1 %c, label %loop, label %exit2

exit:                                             ; preds = %loop

  ret i32 0

exit2:                                            ; preds = %latch

  %fx.lcssa = phi i32 [ %fx, %loop ]

  ret i32 %fx.lcssa

}

In this case, we can see opportunities both for LFTR of the loop exit, and for

rewriting the loop exit in terms of a constant value.  At the moment, we get

neither.  The core reason is that we fail to figure out that %iv.next does not

overflow within the iteration space, and thus fail to form an addrec for %fx. 

Worth noting is that another exit with a computable exit count is enough to

short circuit the analysis problem in most cases; this is a rarer case than it

might first look.  

Note that there are many variations on this test case, each with similar, but

not entirely overlapping sets of issues.  Particularly interesting ones:

replace udiv w/sdiv or ashr or lhsr, mark the udiv exact, step < divider, and

non-constant step or divider.  Amusingly, all of the multiply variants (mul,

shl) appear to mostly work.    

Here's a collection of ideas for enhancements related to this test:

1) Teach IndVars to infer the exact flag on the udiv for a variation with the

add marked nsw/nuw.

2) Extend computeLoopExitsExhaustively to binary search over the possible *max*

taken counts trying to compute the condition for each iteration independently. 

Key observation: there are closed form solutions for many addrecs and we do not

need to brute force from the beginning.  (I prototyped this; it works, but see

next item.)  Warning: This does not give an exact exit count, only a max.

3) We run into a caching problem even with a working max take count

computation.  We end up forming a (zext i32 (0,+,4) to i34) as part of the udiv

distribution check, but because we'd formed the same node before computing the

backedge taken count, we *don't* get the chance to simplify once the critical

nsw/nuw flags have been added.  This is a core problem in SCEV from what I can

tell.

4) Teach IndVars to convert an ashr to a lshr when input is known strictly

positive.  

5) During SCEV construction, canonicalize an ashr on a positive input to a

udiv.  

(1,4,5) are likely simple and would make good starter projects for anyone

interested in this area of code.  (2) is straight-forward, but the only benefit

w/o (3) is to prove nsw/nuw on the add, (3) is hard - very hard.</pre>

        </div>

      </p>

      <hr>

      <span>You are receiving this mail because:</span>

      <ul>

          <li>You are on the CC list for the bug.</li>

      </ul>

    </body>

</html>