[llvm-commits] dependence analysis

Wed Aug 1 21:21:06 PDT 2012

Hal,

Here's an updated patch file with all your recommended fixes,
plus some additional testing.

Thanks,
Preston

On Tue, Jul 31, 2012 at 3:52 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> On Tue, 31 Jul 2012 14:06:31 -0700
> Preston Briggs <preston.briggs at gmail.com> wrote:
>
>> Hal, thanks for all the comments.
>> How should I make/submit these sorts of changes?
>> Change my source and create a replacement patch file?
>
> Yep, just send an updated patch file in a reply.
>
>>
>>
>> On Tue, Jul 31, 2012 at 12:44 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> >
>> > Preston,
>> >
>> > Thanks! A few comments:
>> >
>> > > +    /// subtractForComparison - When comparing two SCEVs for
>> > > equality,
>> > > +    /// we subtract one from the other and test the difference
>> > > using isZero()
>> > > +    /// and isKnownNonZero(). Using ordinary subtraction for
>> > > SCEVs, we get
>> > > +    /// confused by sign- and zero-extensions, thinking that
>> > > expressions that
>> > > +    /// are clearly not equal might be equal.
>> > > subtractForComparison tries to
>> > > +    /// avoid this problem.
>> > > +    const SCEV *subtractForComparison(const SCEV *X, const SCEV
>> > > *Y) const;
>> >
>> > Should we be improving SCEV here? The comment makes this sound like
>> > a general problem.
>>
>> I think it's a weakness in the SCEVs.
>> Basically, I want to know if X and Y can be equal.
>> Apparently the way to do this is to find the difference,
>> then test it to see if it's known to be non-zero.
>> I asked about it earlier, here:
>> http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-June/050851.html
>> This is my hack at a solution, but it may be brittle
>> (in that it may not continue to be useful if people change the
>> implementation of SCEVs).
>>
>>
>> > > +    /// collectUpperBound - All subscripts are the same type (on
>> > > my machine,
>> > > +    /// an i64). The loop bound may be a smaller type.
>> > > collectUpperBound
>> >
>> > I don't know what your machine is ;) -- if you're going to mention
>> > this in the comment, say on x86_64/Linux (or whatever), for
>> > example, the following is true.
>>
>> Makes sense.
>>
>>
>> > > +// Right now, the implementation lacks an MIV test and cannot
>> > > propagate +// constraints between coupled RDIV subscripts. I'll
>> > > probably implement +// the Power Test as an MIV test.
>> >
>> > As a general rule, don't have comments that refer to yourself. I
>> > would phrase this as, "The Power Test can also be implemented as an
>> > MIV test".
>>
>> Sounds good.
>>
>>
>> > "cannot propagate constraints between coupled RDIV subscripts" -
>> > This yield only false positive dependencies, right?
>>
>> Right. The weakness is mine. The paper explains how to do this
>> particular trick in a fast and accurate fashion. I don't understand
>> it (yet), so the implementation
>> isn't quite as accurate as it should be. That is, it'll be
>> conservative in the presence
>> of coupled RDIV subscripts.
>>
>> When an MIV test gets written, it'll handled the coupled RDIV
>> subscripts, but (probably) at a greater compile-time cost. To an
>> extent, the entire Delta test
>> is about finding the easy cases and solving them quickly, without
>> having to fall back on a more expensive general approach like the
>> Power Test.
>>
>>
>> > > +// When the appropriate fix propagates into the release, replace
>> > > the various +// calls to APInt::sdiv and APInt::srem with
>> > > APInt::sdivrem, where appropriate. +//
>> >
>> > To which fix are you referring. To commit to trunk, you only need
>> > the fix in trunk ;)
>>
>> I found an error in the implementation of APInt::sdivrem.
>> I submitted a fix a while back (but after 3.1).\, hence my confusion.
>> I'll replace sdiv and srem pairs with sdivrem where appropriate.
>>
>>
>> > > +// In addition, the implementation depends on the GEP
>> > > instruction to +// differentiate subscripts. Unfortunately, since
>> > > Clang linearizes subscripts +// for most arrays, this will be
>> > > inadequate. We trust that the GEP instruction +// will eventually
>> > > be extended.
>> >
>> > I am not sure about your trust in this; we may need to (and, indeed,
>> > may want to) delinearize. We should probably discuss this in more
>> > detail in a different thread. For the purpose of commiting this
>> > code, I'd remove this comment.
>>
>> OK. Delinearization wil be fun to work on.
>> Also, a good MIV test will naturally attack many of these cases.
>> Long term, we should continue to press for an improved GEP,
>> 'cause it will give us the best results.
>>
>>
>> > > +// Used to test the dependence analyzer.
>> > > +// Looks through the function, noting the first store instruction
>> > > +// and the first load instruction
>> > > +// (which always follows the first load in our tests).
>> > > +// Calls depends() and prints out the result.
>> > > +// Ignores all other instructions.
>> > > +static
>> > > +void dumpExampleDependence(raw_ostream &OS, Function *F,
>> > > +                           DependenceAnalysis *DA) {
>> > > +  for (inst_iterator SrcI = inst_begin(F), SrcE = inst_end(F);
>> > > +       SrcI != SrcE; ++SrcI) {
>> > > +    if (const StoreInst *Src = dyn_cast<StoreInst>(&*SrcI)) {
>> > > +      for (inst_iterator DstI = SrcI, DstE = inst_end(F);
>> > > +           DstI != DstE; ++DstI) {
>> > > +        if (const LoadInst *Dst = dyn_cast<LoadInst>(&*DstI)) {
>> > > +          if (Dependence *D = DA->depends(Src, Dst, true)) {
>> > > +            D->dump(OS);
>> > > +            delete D;
>> > > +          }
>> > > +          else
>> > > +            OS << "none\n";
>> > > +          return;
>> > > +        }
>> > > +      }
>> > > +    }
>> > > +  }
>> > > +}
>> >
>> > I think that it would be better to print a prefix here, it will
>> > make it easier to grep the debug output, and will make sure that
>> > the FileCheck patterns don't accidentally match 'none' that occurs
>> > as a substring of something else (like the build path).
>>
>> Ah, I see. Good idea.
>>
>>
>> > > +// Returns true is a particular level is scalar; that is,
>> >
>> > Typo? (is -> if).
>>
>> Yes, a typo
>>
>>
>> > > +// Returns true if if splitting this loop will break the
>> > > dependence.
>> >
>> > Typo? (if if -> if)
>>
>> Yes, a typo.
>>
>>
>> > Also, would it make sense to provide a method (or set of methods)
>> > which help(s) with splitting the loop? I don't understand the
>> > workflow necessary to eliminate a dependence by splitting a loop
>> > using the information provided by this pass.
>>
>> Generally, I expect the dependence analyzer to be used to build
>> a dependence graph for a function (basically a map from instructions
>> to dependences). Looking for cycles in the graph shows us loops
>> that cannot be trivially vectorized/parallelized.
>>
>> We can try to improve the situation by examining all the dependences
>> that make up the cycle, looking for ones we can break.
>> Sometimes, peeling the first or last iteration of a loop will break
>> dependences, and I've got flags for those possibilities.
>> Sometimes, splitting a loop at some other iteration will do the trick,
>> and I've got a flag for that case. Rather than waste the space to
>> record the exact iteration (since we rarely know), I decided to
>> provide a method that calculates the iteration. It's a drag that it
>> must work from scratch, but wonderful in that it's possible.
>>
>> Here's an example:
>>
>>     for (i = 0; i < 1000; i++)
>>         A[i] = ...
>>         ... = A[1000 - i - 1]
>>
>> There's a loop-carried flow dependence from the store to the load,
>> found by the weak-crossing SIV test. The dependence will have a flag,
>> indicating that the dependence can be broken by splitting the loop.
>> Calling getSplitIteration will return 500.
>> Splitting the loop at the 500th iteration, thusly
>>
>>     for (i = 0; i < 500; i++)
>>         A[i] = ...
>>         ... = A[1000 - i - 1]
>>
>>     for (i = 500; i < 1000; i++)
>>         A[i] = ...
>>         ... = A[1000 - i - 1]
>>
>> breaks the dependence and allows us to vectorize/parallelize
>> both loops.
>>
>> Perhaps all this example should make it's way into a comment?
>
> Yes, I think that would be best.
>
>>
>>
>> > > +// These are simple getters that hide the implementation
>> > > +// and check to be sure they are used consistently.
>> >
>> > I'm not sure that this comment is necessary. If you wish to keep it,
>> > then (they are -> it is (implementation is singular)).
>>
>> "they" was supposed to refer to the getters,
>> but I'm happy to delete the comment.
>>
>>
>> > > llvm_unreachable("shouldn't reach the end of Constraint::dump");
>> >
>> > I would say, "Unknown constraint type in Constraint::dump"
>>
>> OK.
>>
>>
>> > > +  // if (X->isPoint() && Y->isPoint())  This case can't occur.
>> >
>> > Then it should be an assert (or removed). As a general rule, no
>> > commented-out code.
>>
>> I think the assert is better.
>> I'd like some marker left in place,
>> since the case is covered in Figure 4 from the paper.
>> Better still, I should add a comment arguing why it can't occur,
>> together with an assertion to the effect.
>>
>>
>> > > +    // yucko
>> > > +    return false;
>> >
>> > yucko?
>>
>> An audible sigh, that having done all this work,
>> we can't glean any useful information.
>> I'll make a better comment.
>>
>>
>> > > +// Scrub expressions early, to expose AddRecs.
>> > > +// Walk through expression looking for sign- and zero-extensions.
>> > > +// If found, push them down as far as possible, hopefully
>> > > +// eliminating them entirely.
>> > > +const SCEV *DependenceAnalysis::scrubExpression(const SCEV *E)
>> > > const {
>> >
>> > May this should be named exposeAddRecs? (I don't like
>> > scrubExpression b/c that does not really say anything about what it
>> > does).
>>
>> Sounds good.
>>
>>
>> > > +// Care is required to keep this code up to date w.r.t. the code
>> > > above.
>> >
>> > If this can't be refactored while still keeping the optimization,
>> > then you should at least also place warning comments in the "code
>> > above".
>>
>> The additional comment makes sense.
>> I'll think about the refactoring.
>>
>>
>> > > +  if (Coupled.count()) { // why doesn't Coupled.empty() work
>> > > correctly?
>> >
>> > What do you mean?
>>
>> I think there's an error in the implementation of empty().
>> In my tests, sometimes I saw Coupled.count() == 0
>> but Coupled.empty() == false.
>> I would think empty() would be slightly faster.
>>
>> I looked briefly for the problem, but didn't find it.
>> The comment is a reminder to myself to look harder.
>
> If you have a test case, then you could also file a bug report ;)
>
>  -Hal
>
>>
>> Thanks a lot,
>> Preston
>>
>>
>> > On Mon, 30 Jul 2012 17:48:51 -0700
>> > Preston Briggs <preston.briggs at gmail.com> wrote:
>> >
>> > > I've written a dependence analyzer for LLVM. It's a fairly
>> > > complete implementation of the paper
>> > >
>> > > Practical Dependence Testing
>> > > Gina Goff, Ken Kennedy, and Chau-Wen Tseng
>> > > PLDI 1991
>> > >
>> > >
>> > > It lacks an MIV test and cannot yet propagate constraints between
>> > > coupled RDIV subscripts (discussed in Section 5.3.2 of the
>> > > paper). I expect to add these in the future.
>> > >
>> > > It's organized as a FunctionPass with a single entry point that
>> > > supports testing for dependence between two instructions in a
>> > > function. If there's no dependence, it returns null. If there's a
>> > > dependence, it returns a pointer to a Dependence which can be
>> > > queried about details (what kind of dependence, is it loop
>> > > independent, direction and distance vector entries, etc). I
>> > > haven't included every imaginable feature, but there's a good
>> > > selection that should be adequate for supporting many loop
>> > > transformations. Of course, it can be extended as necessary.
>> > >
>> > > Included in the patch file are many test cases, commented with C
>> > > code showing the loops and array references.
>> > >
>> > > Preston
>> >
>> >
>> >
>> > --
>> > Hal Finkel
>> > Postdoctoral Appointee
>> > Leadership Computing Facility
>> > Argonne National Laboratory
>
>
>
> --
> Hal Finkel
> Postdoctoral Appointee
> Leadership Computing Facility
> Argonne National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: da.patch
Type: application/octet-stream
Size: 314791 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120801/c8ef16d4/attachment.obj>