[llvm-dev] How to get information about data dependencies?

Tue Jul 7 12:11:07 PDT 2020

  > Just to be clear, you don't think it has potential because it's been
> disgned into an inner-loop corner and would take extensive rewriting to
> handle OLV?

First of all, if one wants to do dependence analysis for OLV (which is what
I'm currently working on), they may get away
with extending LAA, or writing their own OLV checker.

That said, no it doesn't seem that the current LAA can help. But that we
can't support OLV currently
is not the core problem for me. We may do it eventually, but if the
implementation is still hacky, then
it would still be bad I think. The core problem is that there's no clear
theoretical foundation to support it.

The code seems to me like a set of "Oh, we needed to handle this case so we
added an `if` there".
Which to me is not the correct way to go and it's pretty much the opposite
of DA.

(Again, I don't mean to criticize LAA implementors - they probably had
their reasons and I guess
they know way more than me)

> it certainly seems like extending DA is the way to go

Maybe but maybe not. Maybe a good theory for run-time checks ends up
being different than DA's ideas (this is partly what I have to do this
summer).
In any case, extending

> but I'd like to hear from the current vectorizer
> maintainers because I don't have enough knowledge to make an informed
> judgment.

Me too!

> There's the VPlan infrastructure which I have not heard much about for
> several months.  What is going on with that?  Yes, that's a vector
> codegen issue but it may be useful to have a more complete picture of
> how all this works or will work.

Sorry, I don't know much about VPlan. I'm involved in the RV (
https://github.com/cdl-saarland/rv)

> Note that when the development of LAA started it also did static checks
only, even though DA already existed in the code base

Interesting, thanks.

> Thanks for sharing your analysis.

No problem :)

> I am not sure if that is an entirely fair characterization of LAA. LAA is
being used by the vectorizer (and other passes) in production for a few
years now. None of the in-tree users of DA seem to be enabled by default
and therefore LAA probably has an order of magnitude more testing, bug
fixes & tuning.

No argument there. The fact that it is used, doesn't necessarily mean it's
clean nor that it has some strong theory supporting it.

> DA’s implementation might be cleaner, but as mentioned earlier, DA
handles only a small subset of things LAA handles and hence I am not sure
comparing the code-complexity is too helpful.

DA does not handle a small subset of LAA's checks, unless I miss something.
It handles way more when it comes to static checking.
I think that comparing code complexity is important. DA is about double the
size of LAA yet it's way more understandable. And the reason for that
I don't think it is that it does something more trivial. Rather, it's based
on a clear paper and has clearly implemented it.

> IMO a lot of LAA complexity comes from things DA does not handle, in
particular runtime check generation.

I agree.

> LAA also analyses & processes a whole loop whereas DA only checks
dependences between 2 memory accesses, as well as decides whether it is
profitable to generate runtime checks.

It processes innermost loops only and the fact that it can handle a whole
loop rather than independent accesses I'm not sure it is a good path. For
LAA's usage it's necessary but it creates a form of coupling (and
complexity).

>  There is definitely potential for improving the structure & organization
of LAA, as well as improving the documentation. Happy to collaborate on
that.

Are we really sure of that? Personally, I was thinking of submitting a
patch but I'm not sure it is worth the effort. However, I'm glad to hear
that you're happy to collaborate. :)
We can talk about that more if you want.

> I am not convinced it makes sense to add runtime check generating to DA
directly, because I don’t think the static dependence checks really need to
be strongly coupled with runtime-check generati

I agree to the latter, maybe to the former. In any case, I'd like to see
the current DA staying as it is. And move the discussion to "what is the
future of run-time checks". Either that
is extending LAA, DA or something else completely.

> To clarify, LAA does static checks and only generate runtime checks if it
cannot prove that the dependence is safe for vectorization statically.
Granted, the static checks mostly boil down to distance computations on
SCEV expressions, but for the current use cases it seems to work well
enough.

Yes, sorry for not stressing that LAA does static checks too as I said
though, in my understanding they're very weak, though still enough for its
usage. And I agree that LAA's capabilities is probably enough for innermost
loop vectorization.
The important thing I believe is the future.

> It might be feasible to use DA for the static checks in LAA. That might
help for a few multi-dimensional cases, but in practice generating proper
runtime-checks for multi-dimensional cases is probably more important, due
to aliasing issues.

Well... I tried that and it doesn't seem to be very useful unfortunately.
The C/C++ way that arrays are defined is probably why DA is not that
useful. Namely that a row can alias with another row in 2D arrays. The
theory behind DA
is quite powerful if we knew that they don't alias. Right now, it just
gives up.

I don't think that LAA can handle multi-dimensional cases either though,
nor do I have a good idea about how to do it myself (in or out of LAA / DA).

Best,
Stefanos

Στις Τετ, 8 Ιουλ 2020 στις 12:48 π.μ., ο/η Florian Hahn <
florian_hahn at apple.com> έγραψε:

> Hi,
>
> > On Jul 7, 2020, at 18:37, Stefanos Baziotis via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> > >  Ah, that's important information I didn't have.  Thank you!
> >
> > No problem, glad to help!
> >
> > To the rest of your thoughts, I certainly agree. One interesting
> question is why LAA
> > didn't use DA at all. Other than that, note that LAA is quite
> specialized, namely for
> > loop vectorization. Actually, it's even more specific. For innermost
> loop vectorization.
> > That affects the design. It might had been easier to create this
> specialized tool than
> > extending a general one (if that was a good path to follow is another
> topic).
> >
> > > But yet they are intimately related in that the kind of information you
> > > want to know statically and dynamically is the same.  I wonder what it
> > > would take to extend DA to generate runtime checks if it can't prove
> > > independence.
> >
> > Indeed, but again, IMHO unifying them is neither easy nor does it make
> sense.
> > They do fundamentally the same thing but their directions are very
> different.
> >
> > So, I see two options:
> >
> > a) As you said
> >
> > >  I wonder what it would take to extend DA to generate runtime checks
> if it can't prove independence.
> >
> > Personally, I see potential but neither do I know what it would take.
> Since this is something that I'm
> > currently thinking of, I would be more than interested to discuss it
> extensively.
> >
> > In any case, I would strongly prefer that we don't follow the LAA path,
> since I don't think it has potential
> > anyway. I think that we should try to find a way to extend it that is
> also based on strong theoretical foundation
> > and maintains the high quality of code.
>
> I am not sure if that is an entirely fair characterization of LAA. LAA is
> being used by the vectorizer (and other passes) in production for a few
> years now. None of the in-tree users of DA seem to be enabled by default
> and therefore LAA probably has an order of magnitude more testing, bug
> fixes & tuning.
>
> DA’s implementation might be cleaner, but as mentioned earlier, DA handles
> only a small subset of things LAA handles and hence I am not sure comparing
> the code-complexity is too helpful.
> IMO a lot of LAA complexity comes from things DA does not handle, in
> particular runtime check generation. LAA also analyses & processes a whole
> loop whereas DA only checks dependences between 2 memory accesses, as well
> as decides whether it is profitable to generate runtime checks.
>
> There is definitely potential for improving the structure & organization
> of LAA, as well as improving the documentation. Happy to collaborate on
> that.
>
> I am not convinced it makes sense to add runtime check generating to DA
> directly, because I don’t think the static dependence checks really need to
> be strongly coupled with runtime-check generation.
>
> > b) Extend LAA to do static checks
> >
> > The question here is though: Why do that? As I said, it doesn't seem to
> have potential and I believe that people
> > working on vectorizers (either LLVM's current one or external like e.g.
> RV and VPlan) don't do either.
> >
>
> To clarify, LAA does static checks and only generate runtime checks if it
> cannot prove that the dependence is safe for vectorization statically.
> Granted, the static checks mostly boil down to distance computations on
> SCEV expressions, but for the current use cases it seems to work well
> enough.
>
> It might be feasible to use DA for the static checks in LAA. That might
> help for a few multi-dimensional cases, but in practice generating proper
> runtime-checks for multi-dimensional cases is probably more important, due
> to aliasing issues.
>
> Cheers,
> Florian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200707/09acf263/attachment-0001.html>