[llvm-dev] IVUsers pass is fragile. Is this okay? How can it be resolved?

Thu Sep 14 19:30:08 PDT 2017

On 09/14/2017 10:43 AM, Daniel Neilson wrote:
> Thank you for your thoughts, Hal. More information below...
>
>> On Sep 13, 2017, at 5:43 PM, Hal Finkel <hfinkel at anl.gov 
>> <mailto:hfinkel at anl.gov>> wrote:
>> On 09/13/2017 01:01 PM, Daniel Neilson via llvm-dev wrote:
>
> … snip
>
>>>  For example, the following IR will produce different sets of IV 
>>> users if either:
>>> i) The order of the PHI nodes in the %loop block are reordered; or
>>> ii) The uselistorder directive is uncommented
>>>
>>> ---
>>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"
>>> target triple = "x86_64-unknown-linux-gnu"
>>>
>>> define void @test(i64 %v1, i32 %v2, i64* %addr) {
>>> entry:
>>>  br label %loop
>>>
>>> loop:
>>>  %iv = phi i64 [%v1, %entry], [%iv.inc, %loop]
>>>  %iv2 = phi i32 [%v2, %entry], [%5, %loop]
>>>  %0 = trunc i64 %iv to i32
>>>  %1 = sub i32 %iv2, %0
>>>  %2 = sitofp i32 %1 to double
>>>  %3 = sub i64 0, %iv
>>>  %4 = trunc i64 %3 to i32
>>>  %5 = sub i32 %1, %4
>>>  %iv.inc = add i64 %iv, 1
>>>  store i64 %iv.inc, i64* %addr, align 8
>>>  br i1 undef, label %loop, label %exit
>>>
>>> exit:
>>>  ret void
>>>
>>> ; uselistorder directives ----
>>> ; uselistorder i64 %iv, {2, 1, 0}
>>> }
>>> —
>
> … snip
>
>>>  So, to get back to the original questions:
>>> 1) What exactly is IVUsers supposed to be finding? For instance, in 
>>> the example above, what would be the ideal/correct set of IVUsers 
>>> that the analysis should be finding?
>>
>> I don't have a good answer to this question, other than the obvious 
>> one (that it's supposed to find all values that have interesting SCEV 
>> expressions making use of the induction variable). Interesting here 
>> means that it's an affine addrec or has a starting value that is one 
>> (recursively). It also helps support post-inc transformations.
>>
>> The problem here is that we're trying to avoid calling getSCEV on all 
>> instructions just to see if they end up being an addrec of the given 
>> loop. Maybe we should do this in two steps? First, walk the users to 
>> find the instruction on which to call getSCEV. Then, go through the 
>> instructions in BB order, calling getSCEV on those identified 
>> instructions.
>>
>
> I’m not sure, either. What it looks like, just from reading and 
> interpreting the implementation, is that it’s looking for the SCEVs of 
> instructions that terminate a def-use chain that starts at a 
> loop-header phi. There are are a few criteria for what constitutes 
> “terminating” a def-use chain, but the most fundamental one (I think) 
> appears to be that a user of the SCEV isn’t itself an “interesting” SCEV.
>
>>>
>>> 2) Is it acceptable that there is this sort of difference in the 
>>> IVUsers analysis results based on nothing more than instruction or 
>>> use list ordering? I personally hope not; it has been mildly 
>>> infuriating having to narrow down on a bug with this difference in 
>>> place.
>>
>> Technically speaking, the dependency on the order of the phis is okay 
>> (i.e., it's possible they'll be no good way to avoid that). Not 
>> having it is clearly better. The dependency on the use-list ordering 
>> is highly discouraged. As you've noticed, this makes problems very 
>> hard to track down.
>>
>> Part of the problem here may be that, because what SCEV proves, and 
>> thus returns, is dependent on what SCEV's have been previously 
>> constructed (unfortunately), it's not hard to develop these kinds of 
>> processing-order dependencies with analyses that use SCEV.
>>
>>  -Hal
>>
>
>  Thankfully, it looks like the SCEVs that are produced for each 
> instruction in the DFT that I’m looking at are consistent; there 
> doesn’t seem to be any affect on the SCEVs themselves as a result of 
> the traversal orders.
>
>  It looks to me that there are two pieces of the implementation of 
> IVUsers that are leading to the fragility (i.e. dependence on input 
> ordering) that I am seeing.
> A) The inclusion of Processed.count(User) as a condition of the two if 
> statements at approx lines 235 & 241 in IVUsers::AddUsersImpl() of 
> lib/Analysis/IVUsers.cpp.
> B) It doesn’t terminate a DFT if it encounters a phi node in the 
> loop-header that it hasn’t seen before.
>
>  To help illustrate these, here are the relevant DFT traces from the 
> sample IR that I provided in the original post. (1) is the IR as-is, 
> and (2) is with the uselistorder instruction active.
>
> DFT (1)
> %iv = phi [%v1], [%iv.inc]  (ret true)
> SCEV: {%v1,+,1}<%loop>
>    * User: %iv.inc = add %iv, 1 (ret true)
>            SCEV: {(1 + %v1),+,1}<%loop>
>       * User: store %iv.inc, %addr  (ret false) **** ADD as user of 
> %iv.inc def ****
>       * User: %iv = phi [%v1], [%iv.inc] (PHI already processed)
>    *  User: %3 = sub 0, %iv (ret true)
>             SCEV: {(-1 * %v1),+,-1}<%loop>
>       * User: %4 = trunc %3 (ret true)
>               SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
>          * User: %5 = sub %1, %4 (ret true)
>                  SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * 
> (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) 
> + (-1 * (trunc i64 %v1 to i32)))}<%loop>
>             * User: %iv2 = phi [%v2], [%5] (ret true)
>                     SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) 
> + (-1 * (trunc i64 %v1 to i32)))}<%loop>
>                * User: %1 = sub %iv2, %0 (ret true)
>                        SCEV: {((-1 * (trunc i64 %v1 to i32)) + 
> %v2),+,(-1 + (-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 
> %v1 to i32)))}<%loop>
>                   * User: %5 = sub %1, %4 (already processed) **** ADD 
> as user of %1 def ****
>                   * User: %2 = sitofp %1 (ret false) **** ADD as user 
> of %1 def ****
>    * User: %0 = trunc %iv (ret true)
>            SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
>       * User: %1 = sub %iv2, %0 (already processed) **** ADD as user 
> of %0 def ****
>
> IV Users for loop %loop:
>   %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* 
> %addr, align 8
>   %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %5 
> = sub i32 %1, %4
>   %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 
> = sitofp i32 %1 to double
>   %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in    %1 = sub i32 %iv2, %0
>
> DFT (2)
> %iv = phi [%v1], [%iv.inc] (ret true)
> SCEV: {%v1,+,1}<%loop>
>    * User: %0 = trunc %iv (ret true)
>            SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
>       * User: %1 = sub %iv2, %0 (ret true)
>               SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 
> * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>
>          * User: %5 = sub %1, %4 (ret true)
>                  SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * 
> (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) 
> + (-1 * (trunc i64 %v1 to i32)))}<%loop>
>             * User: %iv2 = phi [%v2], [%5] (ret true)
>                     SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) 
> + (-1 * (trunc i64 %v1 to i32)))}<%loop>
>                * User: %1 = sub %iv2, %0 (already processed) **** ADD 
> as user of %iv2 def ****
>          * User: %2 = sitofp %1 (ret false) **** ADD as user of %1 def 
> ****
>    * User: %3 = sub 0, %iv (ret true)
>            SCEV: {(-1 * %v1),+,-1}<%loop>
>       * User: %4 = trunc %3 (ret true)
>               SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
>          * User: %5 = sub %1, %4 (already processed) **** ADD as user 
> of %4 def ****
>    * User: %iv.inc = add %iv, 1 (ret true)
>            SCEV: {(1 + %v1),+,1}<%loop>
>       * User: store %iv.inc, %addr (ret false) **** ADD as user of 
> %iv.inc ****
>       * User: %iv = phi [%v1], [%iv.inc] (PHI already processed)
>
> IV Users for loop %loop:
>   %iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc 
> i64 %v1 to i32)))}<%loop> in    %1 = sub i32 %iv2, %0
>   %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 
> = sitofp i32 %1 to double
>   %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in  %5 = sub i32 %1, %4
>   %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* 
> %addr, align 8
>
>
>  For point (A), first take a close look at what the DFT does when it 
> encounters the def of %1 as a User in DFT (1). The first time that 
> %1’s def is encountered is as a user of %iv2. We continue traversing 
> through the def of %1 at this point to process the users of it because 
> we haven’t encountered the def of %1 yet (i.e. Processed.count(%1) is 
> 0). However, we encounter the def of %1 again in this same DFT later 
> as a user of the def of %0. This second time that we encounter %1 we 
> don’t traverse its users because we’ve seen %1 before (i.e. 
> Processed.count(%1) is 1), and so we add the SCEV of %0 to the IVUsers 
> set. In DFT (2) we encounter the def of %1 first as a user of %0 — 
> which allows the DFT to process the users of %1 instead of adding the 
> SCEV of %0 to the IVUsers set.
>
>  For point (B), take a look at the traversal order in DFT (1). You’ll 
> see the chain %iv -> %3 -> %4 -> %5 -> %iv2 -> %1 -> %5. The def of %5 
> appears twice in the same def-use chain, and the second time we see it 
> will be the same situation as in point (A). The reason that we revisit 
> %5 in the same def-use chain is that we allow the DFT to continue 
> through %iv2 (%iv2 is a phi node in the loop-header) that will, in 
> effect, allow the DFT to loop-around and revisit. The result is adding 
> the def of %5 as a user of %1’s SCEV in DFT (1). We don’t do the same 
> in DFT (2) because we don’t visit the def of %5 in the same way. 
> Instead, in DFT (2) the def of %1 ends up being the value that appears 
> twice in a def-use chain.
>
>  What I *think* would be proper for IVUsers is to have 
> IVUsers::AddUserImpl() :
>
> A) Memoize its result for each given instruction instead of having the 
> Processed.count(User) condition at lines 235 & 241. The presence of 
> Processed.count(User) in the conditions at lines 235 & 241 basically 
> has the effect of changing the return value of AddUsersImpl() for 
> “interesting” instructions from true to false, which results in 
> different behaviour depending on the order in which instructions are 
> visited. Interestingly, if we don’t memoize the result of 
> AddUsersImpl() at all, but instead just remove the 
> Processed.count(User) parts of these two if statements, then the 
> return value of AddUsersImpl() will *always* be true and we would have 
> consistent results for recursive calls on “interesting” SCEVS, but 
> inconsistent results for recursive calls on instructions that aren’t 
> interesting.
>
> B) Similar to the condition at line 212, don’t let the DFT continue 
> into Users that are phi nodes in the loop-header. We’re going to visit 
> every phi node in the loop-header as the root of a DFT, in turn, 
> anyways, so this just prevents the possibility of revisiting the same 
> instruction multiple times in the same def-use chain.

These seems sensible.

>
>  If I mock these two changes up, the IVUsers sets in all three of my 
> original situations (IR as-is, IR with header-phi’s rearranged, and IR 
> with use-list ordering) all produce the exact same set of IVUsers. 
> However, the set of IVUsers ends up being a subset of what we 
> previously had. I don’t know if this is correct because the IVUsers 
> analysis doesn’t appear to be well-defined with respect to what it 
> should be generating. Specifically, with this test we end up with 
> slight variations of this DFT:
>
> %iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ]
> SCEV: {%v1,+,1}<%loop>
>    * User: %iv.inc = add i64 %iv, 1 (ret true)
>            SCEV: {(1 + %v1),+,1}<%loop>
>       * User: store i64 %iv.inc, i64* %addr, align 8. (ret false) **** 
> ADD as user of %iv.inc def ****
>       * User: %iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ] 
> (already processed PHI, skip)
>    * User: %3 = sub i64 0, %iv (ret true)
>            SCEV: {(-1 * %v1),+,-1}<%loop>
>       * User: %4 = trunc i64 %3 to i32 (ret true)
>               SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
>          * User: %5 = sub i32 %1, %4 (ret true)
>                  SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * 
> (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) 
> + (-1 * (trunc i64 %v1 to i32)))}<%loop>
>             * User: %iv2 = phi i32 [ %v2, %entry ], [ %5, %loop ] (PHI 
> in loop header, skip)
>    * User: %0 = trunc i64 %iv to i32 (ret true)
>            SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
>       * User: %1 = sub i32 %iv2, %0
>               SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 
> * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>
>          * User: %5 = sub i32 %1, %4 (memoized - ret true)
>          * User: %2 = sitofp i32 %1 to double (ret false) **** ADD as 
> user of %1 def ****
> %iv2 = phi i32 [ %v2, %entry ], [ %5, %loop ]
> SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 
> %v1 to i32)))}<%loop>
>    * User: %1 = sub i32 %iv2, %0 (memoized - ret true)
>
> IVUsers:
>   %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* 
> %addr, align 8
>   %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 
> = sitofp i32 %1 to double
>
>
>  This looks sensible to me, but, again, I don’t really have a good 
> sense of what the “proper” IVUsers results are expected to be.

Compared to the previous results, this is missing:

   %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in    %1 = sub i32 %iv2, %0
   %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in    %5 = sub i32 
%1, %4
   %iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc 
i64 %v1 to i32)))}<%loop> in    %1 = sub i32 %iv2, %0

Can you comment on why?

Thanks again,
Hal

>
>
>  Thoughts anyone?
>
> Thanks,
>  Daniel
>
> ---
> Daniel Neilson, Ph.D.
> Azul Systems
>

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170914/4eb60399/attachment.html>