[llvm-dev] IVUsers pass is fragile. Is this okay? How can it be resolved?
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Thu Sep 14 19:30:08 PDT 2017
On 09/14/2017 10:43 AM, Daniel Neilson wrote:
> Thank you for your thoughts, Hal. More information below...
>
>> On Sep 13, 2017, at 5:43 PM, Hal Finkel <hfinkel at anl.gov
>> <mailto:hfinkel at anl.gov>> wrote:
>> On 09/13/2017 01:01 PM, Daniel Neilson via llvm-dev wrote:
>
> … snip
>
>>> For example, the following IR will produce different sets of IV
>>> users if either:
>>> i) The order of the PHI nodes in the %loop block are reordered; or
>>> ii) The uselistorder directive is uncommented
>>>
>>> ---
>>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"
>>> target triple = "x86_64-unknown-linux-gnu"
>>>
>>> define void @test(i64 %v1, i32 %v2, i64* %addr) {
>>> entry:
>>> br label %loop
>>>
>>> loop:
>>> %iv = phi i64 [%v1, %entry], [%iv.inc, %loop]
>>> %iv2 = phi i32 [%v2, %entry], [%5, %loop]
>>> %0 = trunc i64 %iv to i32
>>> %1 = sub i32 %iv2, %0
>>> %2 = sitofp i32 %1 to double
>>> %3 = sub i64 0, %iv
>>> %4 = trunc i64 %3 to i32
>>> %5 = sub i32 %1, %4
>>> %iv.inc = add i64 %iv, 1
>>> store i64 %iv.inc, i64* %addr, align 8
>>> br i1 undef, label %loop, label %exit
>>>
>>> exit:
>>> ret void
>>>
>>> ; uselistorder directives ----
>>> ; uselistorder i64 %iv, {2, 1, 0}
>>> }
>>> —
>
> … snip
>
>>> So, to get back to the original questions:
>>> 1) What exactly is IVUsers supposed to be finding? For instance, in
>>> the example above, what would be the ideal/correct set of IVUsers
>>> that the analysis should be finding?
>>
>> I don't have a good answer to this question, other than the obvious
>> one (that it's supposed to find all values that have interesting SCEV
>> expressions making use of the induction variable). Interesting here
>> means that it's an affine addrec or has a starting value that is one
>> (recursively). It also helps support post-inc transformations.
>>
>> The problem here is that we're trying to avoid calling getSCEV on all
>> instructions just to see if they end up being an addrec of the given
>> loop. Maybe we should do this in two steps? First, walk the users to
>> find the instruction on which to call getSCEV. Then, go through the
>> instructions in BB order, calling getSCEV on those identified
>> instructions.
>>
>
> I’m not sure, either. What it looks like, just from reading and
> interpreting the implementation, is that it’s looking for the SCEVs of
> instructions that terminate a def-use chain that starts at a
> loop-header phi. There are are a few criteria for what constitutes
> “terminating” a def-use chain, but the most fundamental one (I think)
> appears to be that a user of the SCEV isn’t itself an “interesting” SCEV.
>
>>>
>>> 2) Is it acceptable that there is this sort of difference in the
>>> IVUsers analysis results based on nothing more than instruction or
>>> use list ordering? I personally hope not; it has been mildly
>>> infuriating having to narrow down on a bug with this difference in
>>> place.
>>
>> Technically speaking, the dependency on the order of the phis is okay
>> (i.e., it's possible they'll be no good way to avoid that). Not
>> having it is clearly better. The dependency on the use-list ordering
>> is highly discouraged. As you've noticed, this makes problems very
>> hard to track down.
>>
>> Part of the problem here may be that, because what SCEV proves, and
>> thus returns, is dependent on what SCEV's have been previously
>> constructed (unfortunately), it's not hard to develop these kinds of
>> processing-order dependencies with analyses that use SCEV.
>>
>> -Hal
>>
>
> Thankfully, it looks like the SCEVs that are produced for each
> instruction in the DFT that I’m looking at are consistent; there
> doesn’t seem to be any affect on the SCEVs themselves as a result of
> the traversal orders.
>
> It looks to me that there are two pieces of the implementation of
> IVUsers that are leading to the fragility (i.e. dependence on input
> ordering) that I am seeing.
> A) The inclusion of Processed.count(User) as a condition of the two if
> statements at approx lines 235 & 241 in IVUsers::AddUsersImpl() of
> lib/Analysis/IVUsers.cpp.
> B) It doesn’t terminate a DFT if it encounters a phi node in the
> loop-header that it hasn’t seen before.
>
> To help illustrate these, here are the relevant DFT traces from the
> sample IR that I provided in the original post. (1) is the IR as-is,
> and (2) is with the uselistorder instruction active.
>
> DFT (1)
> %iv = phi [%v1], [%iv.inc] (ret true)
> SCEV: {%v1,+,1}<%loop>
> * User: %iv.inc = add %iv, 1 (ret true)
> SCEV: {(1 + %v1),+,1}<%loop>
> * User: store %iv.inc, %addr (ret false) **** ADD as user of
> %iv.inc def ****
> * User: %iv = phi [%v1], [%iv.inc] (PHI already processed)
> * User: %3 = sub 0, %iv (ret true)
> SCEV: {(-1 * %v1),+,-1}<%loop>
> * User: %4 = trunc %3 (ret true)
> SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
> * User: %5 = sub %1, %4 (ret true)
> SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 *
> (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32))
> + (-1 * (trunc i64 %v1 to i32)))}<%loop>
> * User: %iv2 = phi [%v2], [%5] (ret true)
> SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32))
> + (-1 * (trunc i64 %v1 to i32)))}<%loop>
> * User: %1 = sub %iv2, %0 (ret true)
> SCEV: {((-1 * (trunc i64 %v1 to i32)) +
> %v2),+,(-1 + (-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
> %v1 to i32)))}<%loop>
> * User: %5 = sub %1, %4 (already processed) **** ADD
> as user of %1 def ****
> * User: %2 = sitofp %1 (ret false) **** ADD as user
> of %1 def ****
> * User: %0 = trunc %iv (ret true)
> SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
> * User: %1 = sub %iv2, %0 (already processed) **** ADD as user
> of %0 def ****
>
> IV Users for loop %loop:
> %iv.inc = {(1 + %v1),+,1}<%loop> in store i64 %iv.inc, i64*
> %addr, align 8
> %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in %5
> = sub i32 %1, %4
> %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in %2
> = sitofp i32 %1 to double
> %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in %1 = sub i32 %iv2, %0
>
> DFT (2)
> %iv = phi [%v1], [%iv.inc] (ret true)
> SCEV: {%v1,+,1}<%loop>
> * User: %0 = trunc %iv (ret true)
> SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
> * User: %1 = sub %iv2, %0 (ret true)
> SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1
> * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>
> * User: %5 = sub %1, %4 (ret true)
> SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 *
> (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32))
> + (-1 * (trunc i64 %v1 to i32)))}<%loop>
> * User: %iv2 = phi [%v2], [%5] (ret true)
> SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32))
> + (-1 * (trunc i64 %v1 to i32)))}<%loop>
> * User: %1 = sub %iv2, %0 (already processed) **** ADD
> as user of %iv2 def ****
> * User: %2 = sitofp %1 (ret false) **** ADD as user of %1 def
> ****
> * User: %3 = sub 0, %iv (ret true)
> SCEV: {(-1 * %v1),+,-1}<%loop>
> * User: %4 = trunc %3 (ret true)
> SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
> * User: %5 = sub %1, %4 (already processed) **** ADD as user
> of %4 def ****
> * User: %iv.inc = add %iv, 1 (ret true)
> SCEV: {(1 + %v1),+,1}<%loop>
> * User: store %iv.inc, %addr (ret false) **** ADD as user of
> %iv.inc ****
> * User: %iv = phi [%v1], [%iv.inc] (PHI already processed)
>
> IV Users for loop %loop:
> %iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc
> i64 %v1 to i32)))}<%loop> in %1 = sub i32 %iv2, %0
> %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in %2
> = sitofp i32 %1 to double
> %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in %5 = sub i32 %1, %4
> %iv.inc = {(1 + %v1),+,1}<%loop> in store i64 %iv.inc, i64*
> %addr, align 8
>
>
> For point (A), first take a close look at what the DFT does when it
> encounters the def of %1 as a User in DFT (1). The first time that
> %1’s def is encountered is as a user of %iv2. We continue traversing
> through the def of %1 at this point to process the users of it because
> we haven’t encountered the def of %1 yet (i.e. Processed.count(%1) is
> 0). However, we encounter the def of %1 again in this same DFT later
> as a user of the def of %0. This second time that we encounter %1 we
> don’t traverse its users because we’ve seen %1 before (i.e.
> Processed.count(%1) is 1), and so we add the SCEV of %0 to the IVUsers
> set. In DFT (2) we encounter the def of %1 first as a user of %0 —
> which allows the DFT to process the users of %1 instead of adding the
> SCEV of %0 to the IVUsers set.
>
> For point (B), take a look at the traversal order in DFT (1). You’ll
> see the chain %iv -> %3 -> %4 -> %5 -> %iv2 -> %1 -> %5. The def of %5
> appears twice in the same def-use chain, and the second time we see it
> will be the same situation as in point (A). The reason that we revisit
> %5 in the same def-use chain is that we allow the DFT to continue
> through %iv2 (%iv2 is a phi node in the loop-header) that will, in
> effect, allow the DFT to loop-around and revisit. The result is adding
> the def of %5 as a user of %1’s SCEV in DFT (1). We don’t do the same
> in DFT (2) because we don’t visit the def of %5 in the same way.
> Instead, in DFT (2) the def of %1 ends up being the value that appears
> twice in a def-use chain.
>
> What I *think* would be proper for IVUsers is to have
> IVUsers::AddUserImpl() :
>
> A) Memoize its result for each given instruction instead of having the
> Processed.count(User) condition at lines 235 & 241. The presence of
> Processed.count(User) in the conditions at lines 235 & 241 basically
> has the effect of changing the return value of AddUsersImpl() for
> “interesting” instructions from true to false, which results in
> different behaviour depending on the order in which instructions are
> visited. Interestingly, if we don’t memoize the result of
> AddUsersImpl() at all, but instead just remove the
> Processed.count(User) parts of these two if statements, then the
> return value of AddUsersImpl() will *always* be true and we would have
> consistent results for recursive calls on “interesting” SCEVS, but
> inconsistent results for recursive calls on instructions that aren’t
> interesting.
>
> B) Similar to the condition at line 212, don’t let the DFT continue
> into Users that are phi nodes in the loop-header. We’re going to visit
> every phi node in the loop-header as the root of a DFT, in turn,
> anyways, so this just prevents the possibility of revisiting the same
> instruction multiple times in the same def-use chain.
These seems sensible.
>
> If I mock these two changes up, the IVUsers sets in all three of my
> original situations (IR as-is, IR with header-phi’s rearranged, and IR
> with use-list ordering) all produce the exact same set of IVUsers.
> However, the set of IVUsers ends up being a subset of what we
> previously had. I don’t know if this is correct because the IVUsers
> analysis doesn’t appear to be well-defined with respect to what it
> should be generating. Specifically, with this test we end up with
> slight variations of this DFT:
>
> %iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ]
> SCEV: {%v1,+,1}<%loop>
> * User: %iv.inc = add i64 %iv, 1 (ret true)
> SCEV: {(1 + %v1),+,1}<%loop>
> * User: store i64 %iv.inc, i64* %addr, align 8. (ret false) ****
> ADD as user of %iv.inc def ****
> * User: %iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ]
> (already processed PHI, skip)
> * User: %3 = sub i64 0, %iv (ret true)
> SCEV: {(-1 * %v1),+,-1}<%loop>
> * User: %4 = trunc i64 %3 to i32 (ret true)
> SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
> * User: %5 = sub i32 %1, %4 (ret true)
> SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 *
> (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32))
> + (-1 * (trunc i64 %v1 to i32)))}<%loop>
> * User: %iv2 = phi i32 [ %v2, %entry ], [ %5, %loop ] (PHI
> in loop header, skip)
> * User: %0 = trunc i64 %iv to i32 (ret true)
> SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
> * User: %1 = sub i32 %iv2, %0
> SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1
> * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>
> * User: %5 = sub i32 %1, %4 (memoized - ret true)
> * User: %2 = sitofp i32 %1 to double (ret false) **** ADD as
> user of %1 def ****
> %iv2 = phi i32 [ %v2, %entry ], [ %5, %loop ]
> SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
> %v1 to i32)))}<%loop>
> * User: %1 = sub i32 %iv2, %0 (memoized - ret true)
>
> IVUsers:
> %iv.inc = {(1 + %v1),+,1}<%loop> in store i64 %iv.inc, i64*
> %addr, align 8
> %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64
> (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in %2
> = sitofp i32 %1 to double
>
>
> This looks sensible to me, but, again, I don’t really have a good
> sense of what the “proper” IVUsers results are expected to be.
Compared to the previous results, this is missing:
%0 = {(trunc i64 %v1 to i32),+,1}<%loop> in %1 = sub i32 %iv2, %0
%4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in %5 = sub i32
%1, %4
%iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc
i64 %v1 to i32)))}<%loop> in %1 = sub i32 %iv2, %0
Can you comment on why?
Thanks again,
Hal
>
>
> Thoughts anyone?
>
> Thanks,
> Daniel
>
> ---
> Daniel Neilson, Ph.D.
> Azul Systems
>
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170914/4eb60399/attachment.html>
More information about the llvm-dev
mailing list