[llvm] r184684 - LoopVectorize: Add utility class for checking dependency among accesses

Preston Briggs preston.briggs at gmail.com
Mon Jul 1 13:43:54 PDT 2013


Hi,

Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> I have taken a high-level look at the implementation of the Dependence
> Analysis pass.

Thanks.  It's a sizable chunk of code and I appreciate your effort.



> - Using GetElementPtr during the analysis.
>
>   Part of the current analysis depends on two geps with matching pointer
>   types.

That's not exactly correct.

   - If I find a pair of references that have GEPs that look the same, then
   I take advantage of them to find the separate indices (I recognize that you
   complain about this idea later on).
   - If one of the references lacks a GEP or the GEPs have different types,
   then I use the underlying SCEVs for all the analysis.

If we immediately jump to using SCEVs, we are essentially linearizing all
array accesses. While it won't matter for the 1-D cases, we're throwing
away information that could help in multidimensional cases. Sometimes we
can delinearize automagically, but I have, in the past, convinced myself
that this is inadequate to handle the general case.

> I don’t think this is the right approach. Two differently typed
> GetElementPtr’s can compute the same access function.

Right. When faced with differently typed GEPs, I'd always use the SCEVs.





> - A GetElementPtr used to describe array accesses does not impose array
>   dimension restrictions.

Right. I don't look at the type of the GEP at all, so I'm not trying to get
dimension info there (or anywhere else, for that matter).

>   The code currently assumes that two different indices of a
>   GetElementPointer can be independently analyzed.

Yes, this is exactly how I'm using GEPs, and it's the only thing I try
to glean from them.

> This is not correct.

OK, so let's focus on this.

> The address part computation of a higher index may “overflow" into the
> lower index. An array type in a gep does not restrict the index range.
> (http://llvm.org/docs/GetElementPtr.html#what-happens-if-an-array-index-is-out-of-bounds,
> Only the type of the array elements is relevant for the address computation)
>   If we want to use a “multi-dimensional” array property (indices can be
> independently analyzed) we have to first show that this holds for the
> LLVM-IR in question. In my example below we have to make sure that N < 256,
> otherwise, we have to analyze the indices together.
>
>   Let me give an example:
>     void f(int A[256][256], long N) {
>     for (long y = 0; y < 128; ++y)
>       for (long i = 0; i < N; ++i)
>         A[y][i+N] = 2 * A[y][i];
>     }
>
> [...]
>
>   If N is big enough (>=256) there is a dependence between the accesses
> (it might not be valid C to have N > 255, but it is certainly valid in LLVM
> IR semantics). The current implementation treats different getelementptr
> indices as independent and will return “none” as dependence answer for the
> two accesses. This is not correct.

So it sounds like we'll need to disable the little bit of code that makes
this assumption and take our lumps with linearizing everything. We might
try writing a delinearizer following Maslov (Hal Finkel also had ideas
worth visiting).

Alternatively, we might revisit the definition of GEPs, looking for an
alternative that lets a C or Fortran front end express multidimensional
array references without linearizing everything.
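
To make the aliasing in Arnold's example concrete, here is a minimal sketch in C
(not code from the pass). It linearizes both subscripts the way a GEP address
computation would, with no range check on the column index, and searches by
brute force for a store A[y1][i1+N] that lands on an element also read as
A[y2][i2] in a different row. The names linear_offset and cross_row_overlap
are made up for this illustration.

```c
#include <assert.h>

enum { ROWS_USED = 128, COLS = 256 };  /* from the example: y < 128, int A[256][256] */

/* Element offset of A[y][col] from A[0][0]; like a GEP, col is not range-checked. */
static long linear_offset(long y, long col) {
    return y * COLS + col;
}

/* Returns 1 if some store A[y1][i1 + n] lands on an element that a load
 * A[y2][i2] (with y2 != y1, y2 < 128, 0 <= i2 < n) also touches, else 0. */
static int cross_row_overlap(long n) {
    assert(n >= 0 && n <= (1L << 20));  /* keep the brute-force search small */
    for (long y1 = 0; y1 < ROWS_USED; ++y1) {
        for (long i1 = 0; i1 < n; ++i1) {
            long store = linear_offset(y1, i1 + n);
            long y2 = store / COLS;  /* row a matching load would be in */
            long i2 = store % COLS;  /* column a matching load would use */
            if (y2 != y1 && y2 < ROWS_USED && i2 < n)
                return 1;
        }
    }
    return 0;
}
```

With N = 256 the store A[0][0+256] and the load A[1][0] both have linear offset
256, so cross_row_overlap(256) returns 1; with N = 100 every store stays inside
its own row and the function returns 0.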



> - Overflow
> It seems the current implementation does not handle overflow correctly.

I can believe it. I think this whole question needs to be carefully
discussed and reviewed.
I certainly don't understand all the constraints.

> We must be very careful with cases where part of the access function
> might overflow.
>
> ;;  for (long unsigned i = 0; i < N; i++) {
> ;;    A[3*i+7] = i;
> ;;    *B++ = A[3*i];
>
> There is a dependence between the two access possible due to integer
> wrapping but the current implementation returns there is none. I have not
> investigated why.

At a glance, I would think there's no possible dependence,
since GCD(3, 3) = 3, which doesn't divide 7.
That's as far as the current analysis will go.
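
For reference, the GCD reasoning above can be sketched in a few lines of C.
This is only an illustration of the textbook test over unbounded integers, not
the code in the pass; the function names are made up. For subscripts a*i + c1
and b*j + c2, an integer solution to a*i + c1 == b*j + c2 can exist only if
gcd(a, b) divides c2 - c1.

```c
#include <assert.h>
#include <stdlib.h>

/* Greatest common divisor of |a| and |b| (Euclid's algorithm). */
static long gcd_long(long a, long b) {
    a = labs(a);
    b = labs(b);
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Returns 0 if a*i + c1 == b*j + c2 provably has no integer solution,
 * 1 if a solution may exist. Assumes ideal integers: no wraparound. */
static int gcd_test_may_depend(long a, long c1, long b, long c2) {
    assert(a != 0 || b != 0);  /* at least one subscript must vary */
    long g = gcd_long(a, b);
    return (c2 - c1) % g == 0;
}
```

For A[3*i+7] versus A[3*i], gcd_test_may_depend(3, 7, 3, 0) returns 0, which is
the "no dependence" answer described above. Note the stated assumption: the test
says nothing about arithmetic that wraps.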

But of course your point is correct.
If I work through what happens with 4-bit words,
we see things like (3*i + 7) % 16 == (3*j) % 16
when i == 0 and j == 13 (both sides equal 7).
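
The 4-bit case is small enough to check exhaustively. The sketch below, with a
made-up helper name, searches all i and j in a 2^bits-wide word for a pair
where (a*i + c) and (a*j) agree modulo 2^bits. It is a brute-force
illustration of the wraparound problem, not a proposal for how the analysis
should handle it.

```c
#include <assert.h>
#include <stdint.h>

/* Searches i, j in [0, 2^bits) for (a*i + c) mod 2^bits == (a*j) mod 2^bits.
 * On success stores the first pair found (smallest i, then smallest j)
 * and returns 1; returns 0 if no pair exists. */
static int find_wrapped_collision(unsigned bits, uint32_t a, uint32_t c,
                                  uint32_t *i_out, uint32_t *j_out) {
    assert(bits >= 1 && bits <= 12);  /* keep the exhaustive search small */
    uint32_t mask = (UINT32_C(1) << bits) - 1;
    for (uint32_t i = 0; i <= mask; ++i) {
        uint32_t store = (a * i + c) & mask;  /* wrapped store subscript */
        for (uint32_t j = 0; j <= mask; ++j) {
            if (((a * j) & mask) == store) {  /* wrapped load subscript */
                *i_out = i;
                *j_out = j;
                return 1;
            }
        }
    }
    return 0;
}
```

Called with bits = 4, a = 3, c = 7, it reports i = 0 and j = 13, the pair
mentioned above. With a = 2, c = 1 it finds nothing, since one side is always
odd and the other always even.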

Seems a fatally hard problem to me.
Any ideas before I give up in despair?


> Thanks for pushing LLVM on this front!

Thanks again for the feedback,
Preston