[llvm-dev] IVUsers pass is fragile. Is this okay? How can it be resolved?

Wed Sep 13 15:43:10 PDT 2017

On 09/13/2017 01:01 PM, Daniel Neilson via llvm-dev wrote:
>
> Hi all,
>  I’ve most recently been grappling with a difficult to reproduce bug. 
> I’ve traced the source of the difficulty in reproduction to the 
> IVUsers analysis pass that is used by Loop Strength Reduction. 
> Specifically, the IVUsers pass’s output is very sensitive to both the 
> use list ordering of the instructions that it is looking at and the 
> ordering of the Phi nodes in the header block of the loop that it’s 
> analyzing. I would like to resolve this fragility, but I’m not sure 
> what correct/expected output should look like for IVUsers. Some sort 
> of guiding light documentation or source comments that explain what 
> the analysis is supposed to be doing seem to be missing.
>
>  For example, the following IR will produce different sets of IV users 
> if either:
> i) The order of the PHI nodes in the %loop block are reordered; or
> ii) The uselistorder directive is uncommented
>
> ---
> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"
> target triple = "x86_64-unknown-linux-gnu"
>
> define void @test(i64 %v1, i32 %v2, i64* %addr) {
> entry:
>  br label %loop
>
> loop:
>  %iv = phi i64 [%v1, %entry], [%iv.inc, %loop]
>  %iv2 = phi i32 [%v2, %entry], [%5, %loop]
>  %0 = trunc i64 %iv to i32
>  %1 = sub i32 %iv2, %0
>  %2 = sitofp i32 %1 to double
>  %3 = sub i64 0, %iv
>  %4 = trunc i64 %3 to i32
>  %5 = sub i32 %1, %4
>  %iv.inc = add i64 %iv, 1
>  store i64 %iv.inc, i64* %addr, align 8
>  br i1 undef, label %loop, label %exit
>
> exit:
>  ret void
>
> ; uselistorder directives ----
> ; uselistorder i64 %iv, {2, 1, 0}
> }
> —
>
>  IVUsers output (opt -analyze -iv-users -S < file.ll):
> 1) IR as-is:
>   %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in   %1 = sub i32 %iv2, %0
>   %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 
> = sitofp i32 %1 to double
>   %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %5 
> = sub i32 %1, %4
>   %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* 
> %addr, align 8
>
> 2) Original phi ordering with uselistorder uncommented
>   %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 
> = sitofp i32 %1 to double
>   %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in   %5 = sub i32 
> %1, %4
>   %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* 
> %addr, align 8
>   %iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc 
> i64 %v1 to i32)))}<%loop> in    %1 = sub i32 %iv2, %0
>
> 3) Re-order Phi nodes:
>   %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in   %1 = sub i32 %iv2, %0
>   %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 
> = sitofp i32 %1 to double
>   %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in   %5 = sub i32 
> %1, %4
>   %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* 
> %addr, align 8
>
>  As you can see, there’s clearly commonalities in these results but 
> there are also SCEVs that will, or will not, appear in the IVUsers 
> results depending on just simple differences in phi-instruction or use 
> list ordering.
>
>  Basically, IVUsers starts a series of DFTs at each of the phi nodes 
> in the loop header. Each DFT traverses through the def-use chains of 
> defs with “interesting” SCEVs. At each step of the DFT it is deciding 
> whether to add users of the current instruction’s SCEV to the IVUsers 
> set based on properties of the user itself. The reason for the 
> differences that I am seeing in these IVUsers results is that if a 
> particular user is seen more than once during this series of DFTs then 
> the subsequent times can result in a different decision for the def’s 
> SCEV.
>
>  To clarify what I mean, I’ll go back to the example…
>
>  In version (1), we find ourselves looking at the users of the SCEV 
> for %1 = sub i32 %iv2, %0. We decide to add both users (%2 = sitofp 
> i32 %1 to double & %5 = sub i32 %1, %4) as IVUsers of the SCEV for %1. 
> Specifically, we add the def of %5 as a user of %1’s SCEV because we 
> have seen the def of %5 as a user earlier in the traversal.
>
>  In version (2), we similarly find ourselves looking at the users of 
> the SCEV for %1 = sub i32 %iv2, %0. However, this time we only add "%2 
> = sitofp i32 %1 to double" as a user of the SCEV for %1. Unlike in 
> (1), we do not add the def of %5 as a user of %1’s SCEV because this 
> is the first time (in any DFT) that we have seen the def of %5 and the 
> def of %5 is considered to be an “interesting” SCEV.
>
>  So, to get back to the original questions:
> 1) What exactly is IVUsers supposed to be finding? For instance, in 
> the example above, what would be the ideal/correct set of IVUsers that 
> the analysis should be finding?

I don't have a good answer to this question, other than the obvious one 
(that it's supposed to find all values that have interesting SCEV 
expressions making use of the induction variable). Interesting here 
means that it's an affine addrec or has a starting value that is one 
(recursively). It also helps support post-inc transformations.

The problem here is that we're trying to avoid calling getSCEV on all 
instructions just to see if they end up being an addrec of the given 
loop. Maybe we should do this in two steps? First, walk the users to 
find the instruction on which to call getSCEV. Then, go through the 
instructions in BB order, calling getSCEV on those identified instructions.

>
> 2) Is it acceptable that there is this sort of difference in the 
> IVUsers analysis results based on nothing more than instruction or use 
> list ordering? I personally hope not; it has been mildly infuriating 
> having to narrow down on a bug with this difference in place.

Technically speaking, the dependency on the order of the phis is okay 
(i.e., it's possible they'll be no good way to avoid that). Not having 
it is clearly better. The dependency on the use-list ordering is highly 
discouraged. As you've noticed, this makes problems very hard to track down.

Part of the problem here may be that, because what SCEV proves, and thus 
returns, is dependent on what SCEV's have been previously constructed 
(unfortunately), it's not hard to develop these kinds of 
processing-order dependencies with analyses that use SCEV.

  -Hal

>
> Thanks,
>  Daniel
>
> ---
> Daniel Neilson, Ph.D.
> Azul Systems
>
>
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170913/46e2f269/attachment.html>